A MAXIMUM-ENTROPY APPROACH TO OFF-POLICY EVALUATION IN AVERAGE-REWARD MDPS (English)
- Lazic, Nevena
- Yin, Dong
- Farajtabar, Mehrdad
- Levine, Nir
- Gorur, Dilan
- Harris, Chris
- Schuurmans, Dale
In: 34th Conference on Neural Information Processing Systems (NeurIPS 2020); Volume 15 of 27; 12461-12471; 2021
- Conference paper / Print
- Title: A MAXIMUM-ENTROPY APPROACH TO OFF-POLICY EVALUATION IN AVERAGE-REWARD MDPS
- Contributors: Lazic, Nevena (author) / Yin, Dong (author) / Farajtabar, Mehrdad (author) / Levine, Nir (author) / Gorur, Dilan (author) / Harris, Chris (author) / Schuurmans, Dale (author)
- Conference: NeurIPS; 34.; 2020; Online
- Publisher: Curran Associates, Inc.
- Place of publication: Red Hook, NY
- Publication date: 2021
- Type of media: Conference paper
- Type of material: Print
- Language: English
The tables of contents are generated automatically and are based on the data records of the individual contributions available in the index of the TIB portal. The display of the Tables of Contents may therefore be incomplete.
- 11676
-
CONSISTENT ESTIMATION OF IDENTIFIABLE NONPARAMETRIC MIXTURE MODELS FROM GROUPED OBSERVATIONS | Ritchie, Alexander / Vandermeulen, Robert A. / Scott, Clayton et al. | 2021
- 11687
-
MANIFOLD STRUCTURE IN GRAPH EMBEDDINGS | Rubin-Delanchy, Patrick et al. | 2021
- 11700
-
ADAPTIVE LEARNED BLOOM FILTER (ADA-BF): EFFICIENT UTILIZATION OF THE CLASSIFIER WITH APPLICATION TO REAL-TIME INFORMATION FILTERING ON THE WEB | Dai, Zhenwei / Shrivastava, Anshumali et al. | 2021
- 11711
-
MCUNET: TINY DEEP LEARNING ON IOT DEVICES | Lin, Ji / Chen, Wei-Ming / Lin, Yujun / Cohn, John / Gan, Chuang / Han, Song et al. | 2021
- 11723
-
IN SEARCH OF ROBUST MEASURES OF GENERALIZATION | Dziugaite, Gintare Karolina / Drouin, Alexandre / Neal, Brady / Rajkumar, Nitarshan / Caballero, Ethan / Wang, Linbo / Mitliagkas, Ioannis / Roy, Daniel M. et al. | 2021
- 11734
-
TASK-AGNOSTIC EXPLORATION IN REINFORCEMENT LEARNING | Zhang, Xuezhou / Ma, Yuzhe / Singla, Adish et al. | 2021
- 11744
-
MULTI-TASK ADDITIVE MODELS FOR ROBUST ESTIMATION AND AUTOMATIC STRUCTURE DISCOVERY | Wang, Yingjie / Chen, Hong / Zheng, Feng / Xu, Chen / Gong, Tieliang / Chen, Yanhong et al. | 2021
- 11756
-
PROVABLY EFFICIENT REWARD-AGNOSTIC NAVIGATION WITH LINEAR VALUE ITERATION | Zanette, Andrea / Lazaric, Alessandro / Kochenderfer, Mykel J. / Brunskill, Emma et al. | 2021
- 11767
-
SOFTMAX DEEP DOUBLE DETERMINISTIC POLICY GRADIENTS | Pan, Ling / Cai, Qingpeng / Huang, Longbo et al. | 2021
- 11778
-
ONLINE DECISION BASED VISUAL TRACKING VIA REINFORCEMENT LEARNING | Song, Ke / Zhang, Wei / Song, Ran / Li, Yibin et al. | 2021
- 11789
-
EFFICIENT MARGINALIZATION OF DISCRETE AND STRUCTURED LATENT VARIABLES VIA SPARSITY | Correia, Gonçalo / Niculae, Vlad / Aziz, Wilker / Martins, André et al. | 2021
- 11803
-
DEEPI2I: ENABLING DEEP HIERARCHICAL IMAGE-TO-IMAGE TRANSLATION BY TRANSFERRING FROM GANS | Wang, Yaxing / Yu, Lu / Weijer, Joost Van De et al. | 2021
- 11816
-
DISTRIBUTIONAL ROBUSTNESS WITH IPMS AND LINKS TO REGULARIZATION AND GANS | Husain, Hisham et al. | 2021
- 11828
-
A SHOOTING FORMULATION OF DEEP LEARNING | Vialard, Francois-Xavier / Kwitt, Roland / Wei, Susan / Niethammer, Marc et al. | 2021
- 11839
-
CSI: NOVELTY DETECTION VIA CONTRASTIVE LEARNING ON DISTRIBUTIONALLY SHIFTED INSTANCES | Tack, Jihoon / Mo, Sangwoo / Jeong, Jongheon / Shin, Jinwoo et al. | 2021
- 11853
-
LEARNING IMPLICIT CREDIT ASSIGNMENT FOR COOPERATIVE MULTI-AGENT REINFORCEMENT LEARNING | Zhou, Meng / Liu, Ziyu / Sui, Pengwei / Li, Yixuan / Chung, Yuk Ying et al. | 2021
- 11865
-
MATE: PLUGGING IN MODEL AWARENESS TO TASK EMBEDDING FOR META LEARNING | Chen, Xiaohan / Wang, Zhangyang / Tang, Siyu / Muandet, Krikamol et al. | 2021
- 11878
-
RESTLESS-UCB, AN EFFICIENT AND LOW-COMPLEXITY ALGORITHM FOR ONLINE RESTLESS BANDITS | Wang, Siwei / Huang, Longbo / Lui, John C. S. et al. | 2021
- 11890
-
PREDICTIVE INFORMATION ACCELERATES LEARNING IN RL | Lee, Kuang-Huei / Fischer, Ian / Liu, Anthony / Guo, Yijie / Lee, Honglak / Canny, John / Guadarrama, Sergio et al. | 2021
- 11902
-
ROBUST AND HEAVY-TAILED MEAN ESTIMATION MADE SIMPLE, VIA REGRET MINIMIZATION | Hopkins, Sam / Li, Jerry / Zhang, Fred et al. | 2021
- 11913
-
HIGH-FIDELITY GENERATIVE IMAGE COMPRESSION | Mentzer, Fabian / Toderici, George D. / Tschannen, Michael / Agustsson, Eirikur et al. | 2021
- 11925
-
A STATISTICAL MECHANICS FRAMEWORK FOR TASK-AGNOSTIC SAMPLE DESIGN IN MACHINE LEARNING | Kailkhura, Bhavya / Thiagarajan, Jayaraman / Li, Qunwei / Zhang, Jize / Zhou, Yi / Bremer, Timo et al. | 2021
- 11936
-
COUNTEREXAMPLE-GUIDED LEARNING OF MONOTONIC NEURAL NETWORKS | Sivaraman, Aishwarya / Farnadi, Golnoosh / Millstein, Todd / Broeck, Guy Van Den et al. | 2021
- 11949
-
A NOVEL APPROACH FOR CONSTRAINED OPTIMIZATION IN GRAPHICAL MODELS | Rouhani, Sara / Rahman, Tahrima / Gogate, Vibhav et al. | 2021
- 11961
-
GLOBAL CONVERGENCE OF DEEP NETWORKS WITH ONE WIDE LAYER FOLLOWED BY PYRAMIDAL TOPOLOGY | Nguyen, Quynh N. / Mondelli, Marco et al. | 2021
- 11973
-
ON THE TRADE-OFF BETWEEN ADVERSARIAL AND BACKDOOR ROBUSTNESS | Weng, Cheng-Hsin / Lee, Yan-Ting / Wu, Shan-Hung et al. | 2021
- 11984
-
IMPLICIT GRAPH NEURAL NETWORKS | Gu, Fangda / Chang, Heng / Zhu, Wenwu / Sojoudi, Somayeh / Ghaoui, Laurent El et al. | 2021
- 11996
-
RETHINKING IMPORTANCE WEIGHTING FOR DEEP LEARNING UNDER DISTRIBUTION SHIFT | Fang, Tongtong / Lu, Nan / Niu, Gang / Sugiyama, Masashi et al. | 2021
- 12008
-
GUIDING DEEP MOLECULAR OPTIMIZATION WITH GENETIC EXPLORATION | Ahn, Sungsoo / Kim, Junsu / Lee, Hankook / Shin, Jinwoo et al. | 2021
- 12022
-
TEMPORAL SPIKE SEQUENCE LEARNING VIA BACKPROPAGATION FOR DEEP SPIKING NEURAL NETWORKS | Zhang, Wenrui / Li, Peng et al. | 2021
- 12034
-
TSPNET: HIERARCHICAL FEATURE LEARNING VIA TEMPORAL SEMANTIC PYRAMID FOR SIGN LANGUAGE TRANSLATION | Li, Dongxu / Xu, Chenchen / Yu, Xin / Zhang, Kaihao / Swift, Benjamin / Suominen, Hanna / Li, Hongdong et al. | 2021
- 12046
-
NEURAL TOPOGRAPHIC FACTOR ANALYSIS FOR FMRI DATA | Sennesh, Eli / Khan, Zulqarnain / Wang, Yiyu / Hutchinson, J Benjamin / Satpute, Ajay / Dy, Jennifer / Meent, Jan-Willem Van De et al. | 2021
- 12057
-
NEURAL ARCHITECTURE GENERATOR OPTIMIZATION | Ru, Robin / Esperanca, Pedro / Carlucci, Fabio Maria et al. | 2021
- 12070
-
A BANDIT LEARNING ALGORITHM AND APPLICATIONS TO AUCTION DESIGN | Nguyen, Kim Thang et al. | 2021
- 12080
-
METAPOISON: PRACTICAL GENERAL-PURPOSE CLEAN-LABEL DATA POISONING | Huang, W. Ronny / Geiping, Jonas / Fowl, Liam / Taylor, Gavin / Goldstein, Tom et al. | 2021
- 12092
-
SAMPLE EFFICIENT REINFORCEMENT LEARNING VIA LOW-RANK MATRIX ESTIMATION | Shah, Devavrat / Song, Dogyoon / Xu, Zhi / Yang, Yuzhe et al. | 2021
- 12104
-
TRAINING GENERATIVE ADVERSARIAL NETWORKS WITH LIMITED DATA | Karras, Tero / Aittala, Miika / Hellsten, Janne / Laine, Samuli / Lehtinen, Jaakko / Aila, Timo et al. | 2021
- 12115
-
DEEPLY LEARNED SPECTRAL TOTAL VARIATION DECOMPOSITION | Grossmann, Tamara G. / Korolev, Yury / Gilboa, Guy / Schoenlieb, Carola et al. | 2021
- 12127
-
FRACTRAIN: FRACTIONALLY SQUEEZING BIT SAVINGS BOTH TEMPORALLY AND SPATIALLY FOR EFFICIENT DNN TRAINING | Fu, Yonggan / You, Haoran / Zhao, Yang / Wang, Yue / Li, Chaojian / Gopalakrishnan, Kailash / Wang, Zhangyang / Lin, Yingyan et al. | 2021
- 12140
-
IMPROVING NEURAL NETWORK TRAINING IN LOW DIMENSIONAL RANDOM BASES | Gressmann, Frithjof / Eaton-Rosen, Zach / Luschi, Carlo et al. | 2021
- 12151
-
SAFE REINFORCEMENT LEARNING VIA CURRICULUM INDUCTION | Turchetta, Matteo / Kolobov, Andrey / Shah, Shital / Krause, Andreas / Agarwal, Alekh et al. | 2021
- 12163
-
LEVERAGE THE AVERAGE: AN ANALYSIS OF KL REGULARIZATION IN REINFORCEMENT LEARNING | Vieillard, Nino / Kozuno, Tadashi / Scherrer, Bruno / Pietquin, Olivier / Munos, Remi / Geist, Matthieu et al. | 2021
- 12175
-
HOW ROBUST ARE THE ESTIMATED EFFECTS OF NONPHARMACEUTICAL INTERVENTIONS AGAINST COVID-19? | Sharma, Mrinank / Mindermann, Sören / Brauner, Jan / Leech, Gavin / Stephenson, Anna / Gavenciak, Tomás / Kulveit, Jan / Teh, Yee Whye / Chindelevitch, Leonid / Gal, Yarin et al. | 2021
- 12187
-
BEYOND INDIVIDUALIZED RECOURSE: INTERPRETABLE AND INTERACTIVE SUMMARIES OF ACTIONABLE RECOURSES | Rawal, Kaivalya / Lakkaraju, Himabindu et al. | 2021
- 12199
-
GENERALIZATION ERROR IN HIGH-DIMENSIONAL PERCEPTRONS: APPROACHING BAYES ERROR WITH CONVEX OPTIMIZATION | Aubin, Benjamin / Krzakala, Florent / Lu, Yue / Zdeborová, Lenka et al. | 2021
- 12211
-
PROJECTION EFFICIENT SUBGRADIENT METHOD AND OPTIMAL NONSMOOTH FRANK-WOLFE METHOD | Thekumparampil, Kiran K. / Jain, Prateek / Netrapalli, Praneeth / Oh, Sewoong et al. | 2021
- 12225
-
PGM-EXPLAINER: PROBABILISTIC GRAPHICAL MODEL EXPLANATIONS FOR GRAPH NEURAL NETWORKS | Vu, Minh / Thai, My T. et al. | 2021
- 12236
-
FEW-COST SALIENT OBJECT DETECTION WITH ADVERSARIAL-PACED LEARNING | Zhang, Dingwen / Tian, Haibin / Han, Jungong et al. | 2021
- 12248
-
MINIMAX ESTIMATION OF CONDITIONAL MOMENT MODELS | Dikkala, Nishanth / Lewis, Greg / Mackey, Lester / Syrgkanis, Vasilis et al. | 2021
- 12263
-
CAUSAL IMITATION LEARNING WITH UNOBSERVED CONFOUNDERS | Zhang, Junzhe / Kumor, Daniel / Bareinboim, Elias et al. | 2021
- 12275
-
YOUR GAN IS SECRETLY AN ENERGY-BASED MODEL AND YOU SHOULD USE DISCRIMINATOR DRIVEN LATENT SAMPLING | Che, Tong / Zhang, Ruixiang / Sohl-Dickstein, Jascha / Larochelle, Hugo / Paull, Liam / Cao, Yuan / Bengio, Yoshua et al. | 2021
- 12288
-
LEARNING BLACK-BOX ATTACKERS WITH TRANSFERABLE PRIORS AND QUERY FEEDBACK | Yang, Jiancheng / Jiang, Yangzhou / Huang, Xiaoyang / Ni, Bingbing / Zhao, Chenglong et al. | 2021
- 12300
-
LOCALLY DIFFERENTIALLY PRIVATE (CONTEXTUAL) BANDITS LEARNING | Zheng, Kai / Cai, Tianle / Huang, Weiran / Li, Zhenguo / Wang, Liwei et al. | 2021
- 12311
-
INVERTIBLE GAUSSIAN REPARAMETERIZATION: REVISITING THE GUMBEL-SOFTMAX | Potapczynski, Andres / Loaiza-Ganem, Gabriel / Cunningham, John P. et al. | 2021
- 12322
-
KERNEL BASED PROGRESSIVE DISTILLATION FOR ADDER NEURAL NETWORKS | Xu, Yixing / Xu, Chang / Chen, Xinghao / Zhang, Wei / Xu, Chunjing / Wang, Yunhe et al. | 2021
- 12334
-
ADVERSARIAL SOFT ADVANTAGE FITTING: IMITATION LEARNING WITHOUT POLICY OPTIMIZATION | Barde, Paul / Roy, Julien / Jeon, Wonseok / Pineau, Joelle / Pal, Chris / Nowrouzezahrai, Derek et al. | 2021
- 12345
-
AGREE TO DISAGREE: ADAPTIVE ENSEMBLE KNOWLEDGE DISTILLATION IN GRADIENT SPACE | Du, Shangchen / You, Shan / Li, Xiaojie / Wu, Jianlong / Wang, Fei / Qian, Chen / Zhang, Changshui et al. | 2021
- 12356
-
THE WASSERSTEIN PROXIMAL GRADIENT ALGORITHM | Salim, Adil / Korba, Anna / Luise, Giulia et al. | 2021
- 12367
-
UNIVERSALLY QUANTIZED NEURAL COMPRESSION | Agustsson, Eirikur / Theis, Lucas et al. | 2021
- 12377
-
TEMPORAL VARIABILITY IN IMPLICIT ONLINE LEARNING | Campolongo, Nicolò / Orabona, Francesco et al. | 2021
- 12388
-
INVESTIGATING GENDER BIAS IN LANGUAGE MODELS USING CAUSAL MEDIATION ANALYSIS | Vig, Jesse / Gehrmann, Sebastian / Belinkov, Yonatan / Qian, Sharon / Nevo, Daniel / Singer, Yaron / Shieber, Stuart et al. | 2021
- 12402
-
OFF-POLICY IMITATION LEARNING FROM OBSERVATIONS | Zhu, Zhuangdi / Lin, Kaixiang / Dai, Bo / Zhou, Jiayu et al. | 2021
- 12414
-
ESCAPING SADDLE-POINT FASTER UNDER INTERPOLATION-LIKE CONDITIONS | Roy, Abhishek / Balasubramanian, Krishnakumar / Ghadimi, Saeed / Mohapatra, Prasant et al. | 2021
- 12426
-
MATÉRN GAUSSIAN PROCESSES ON RIEMANNIAN MANIFOLDS | Borovitskiy, Viacheslav / Terenin, Alexander / Mostowsky, Peter / Deisenroth, Marc et al. | 2021
- 12438
-
IMPROVED TECHNIQUES FOR TRAINING SCORE-BASED GENERATIVE MODELS | Song, Yang / Ermon, Stefano et al. | 2021
- 12449
-
WAV2VEC 2.0: A FRAMEWORK FOR SELF-SUPERVISED LEARNING OF SPEECH REPRESENTATIONS | Baevski, Alexei / Zhou, Yuhao / Mohamed, Abdelrahman / Auli, Michael et al. | 2021
- 12461
-
A MAXIMUM-ENTROPY APPROACH TO OFF-POLICY EVALUATION IN AVERAGE-REWARD MDPS | Lazic, Nevena / Yin, Dong / Farajtabar, Mehrdad / Levine, Nir / Gorur, Dilan / Harris, Chris / Schuurmans, Dale et al. | 2021
- 12472
-
INSTEAD OF REWRITING FOREIGN CODE FOR MACHINE LEARNING, AUTOMATICALLY SYNTHESIZE FAST GRADIENTS | Moses, William / Churavy, Valentin et al. | 2021
- 12486
-
DOES UNSUPERVISED ARCHITECTURE REPRESENTATION LEARNING HELP NEURAL ARCHITECTURE SEARCH? | Yan, Shen / Zheng, Yu / Ao, Wei / Zeng, Xiao / Zhang, Mi et al. | 2021
- 12499
-
VALUE-DRIVEN HINDSIGHT MODELLING | Guez, Arthur / Viola, Fabio / Weber, Theophane / Buesing, Lars / Kapturowski, Steven / Precup, Doina / Silver, David / Heess, Nicolas et al. | 2021