10.5555/645530.655817
Article

Off-Policy Temporal Difference Learning with Function Approximation

Published: 28 June 2001

Abstract

No abstract available.

Published In

ICML '01: Proceedings of the Eighteenth International Conference on Machine Learning
June 2001
629 pages

Publisher

Morgan Kaufmann Publishers Inc.

San Francisco, CA, United States

Qualifiers

  • Article

Cited By

  • (2023) Supported value regularization for offline reinforcement learning. Proceedings of the 37th International Conference on Neural Information Processing Systems, pp. 40587-40609. 10.5555/3666122.3667888. Online publication date: 10-Dec-2023.
  • (2023) Hallucinated adversarial control for conservative offline policy evaluation. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, pp. 1774-1784. 10.5555/3625834.3626000. Online publication date: 31-Jul-2023.
  • (2023) Modified retrace for off-policy temporal difference learning. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, pp. 303-312. 10.5555/3625834.3625863. Online publication date: 31-Jul-2023.
  • (2023) DoMo-AC. Proceedings of the 40th International Conference on Machine Learning, pp. 33657-33673. 10.5555/3618408.3619810. Online publication date: 23-Jul-2023.
  • (2023) Supported trust region optimization for offline reinforcement learning. Proceedings of the 40th International Conference on Machine Learning, pp. 23829-23851. 10.5555/3618408.3619401. Online publication date: 23-Jul-2023.
  • (2023) On the reuse bias in off-policy reinforcement learning. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, pp. 4513-4521. 10.24963/ijcai.2023/502. Online publication date: 19-Aug-2023.
  • (2020) Fast adaptation to new environments via policy-dynamics value functions. Proceedings of the 37th International Conference on Machine Learning, pp. 7920-7931. 10.5555/3524938.3525672. Online publication date: 13-Jul-2020.
  • (2020) Gradient temporal-difference learning with regularized corrections. Proceedings of the 37th International Conference on Machine Learning, pp. 3524-3534. 10.5555/3524938.3525268. Online publication date: 13-Jul-2020.
  • (2020) MOPO. Proceedings of the 34th International Conference on Neural Information Processing Systems, pp. 14129-14142. 10.5555/3495724.3496909. Online publication date: 6-Dec-2020.
  • (2020) Self-imitation learning via generalized lower bound Q-learning. Proceedings of the 34th International Conference on Neural Information Processing Systems, pp. 13964-13975. 10.5555/3495724.3496895. Online publication date: 6-Dec-2020.
