Questions to RL panel @ RL+LLM workshop @ AAAI 2024
We have seen in robot learning that providing a single demonstration before training allows an agent to converge quickly to near-ideal behaviour, even if the demonstration is sub-optimal. Could a similar idea be applied with LLMs, where a user provides textual context/demonstrations to avoid a cold start in training…
by Dan UoY
1 vote (Dan)
Current approaches have focused on using LLMs to define reward functions or to evaluate rewards directly. Can we use LLMs as world models in model-based RL approaches? We know that we need a step time of less than a microsecond for on-policy algorithms; how can we achieve that?
by Salar Rahili
1 vote (Amir Mohammad Dorpoush)
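To make the "LLM as world model" idea concrete, here is a minimal sketch of model-based lookahead where the transition model is an LLM queried in text. Everything here is hypothetical: `query_llm` stands in for a real model call and is stubbed with a toy deterministic transition so the loop runs.

```python
# Sketch: an LLM as the world model in model-based RL (all names hypothetical).

def query_llm(prompt: str) -> str:
    # Hypothetical LLM call. Stubbed: parse "pos=P act=A" and move the agent.
    parts = dict(tok.split("=") for tok in prompt.split())
    pos, act = int(parts["pos"]), parts["act"]
    return str(pos + (1 if act == "right" else -1))

def llm_world_model(state: int, action: str) -> int:
    """Predict the next state by prompting the (stubbed) LLM."""
    return int(query_llm(f"pos={state} act={action}"))

def plan(state: int, horizon: int = 3) -> str:
    """Pick the first action whose imagined rollout under the LLM world model
    ends closest to a fixed goal position."""
    goal = 5
    best_action, best_dist = None, float("inf")
    for first in ("left", "right"):
        s = llm_world_model(state, first)
        for _ in range(horizon - 1):  # greedy imagined rollout
            s = llm_world_model(s, "right" if s < goal else "left")
        if abs(s - goal) < best_dist:
            best_action, best_dist = first, abs(s - goal)
    return best_action

print(plan(0))  # -> right
```

The latency concern in the question shows up here directly: each imagined step is a full autoregressive LLM call, which is why sub-microsecond step times look out of reach without distillation or caching.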
How can we guide RL agents during training? Can we use a policy conditioned on text from the user to help the agent overcome a bottleneck behaviour? (Conditioning the reward function is not the best option, since it isn't transferable at inference.)
by Salar Rahili
1 vote (Dan)
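A minimal sketch of what a text-conditioned policy could look like: the user's guidance is embedded and concatenated with the state features before the policy head, so the same guidance channel is available at inference. The embedding function here is a toy deterministic hashing stub standing in for a real text encoder, and the network weights are random; all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed_text(text: str, dim: int = 8) -> np.ndarray:
    # Toy deterministic "embedding": bucket each token by character sum.
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[sum(map(ord, tok)) % dim] += 1.0
    return vec / max(1.0, np.linalg.norm(vec))

W = rng.normal(size=(8 + 4, 3))  # (text_dim + state_dim) -> 3 actions

def policy(state: np.ndarray, guidance: str) -> int:
    """Return the argmax action given state features and textual guidance."""
    x = np.concatenate([embed_text(guidance), state])
    logits = x @ W
    return int(np.argmax(logits))

state = np.array([0.1, -0.3, 0.5, 0.0])
action = policy(state, "go through the door on the left")
```

Because the conditioning lives in the policy input rather than the reward, the same mechanism transfers to inference: swap the guidance string and the behaviour changes without retraining.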
What are the potential threats of using RL and LLMs combined? How can we avoid them?
by Ali Vaziri
1 vote (Ali Vaziri)
How can researchers combine RL and LLMs with visual representations? There are very few resources available on RL and LLMs. How can we provide easy, clear, and meaningful resources about RL and LLMs, such as the math behind RL, LLMs, etc.?
by Mahammad Humayoo
1 vote (Mahammad Humayoo)
Thoughts on using LLMs as a powerful representation for RL to tackle problems such as non-Markovian domains and zero-shot learning?
1 vote (Dan)
Will LLMs help build AI agents? How? Related: can current LLMs plan and reason? Why or why not? Are there any successful LLM-powered AI agents?
1 vote (Mahammad Humayoo)
How can we leverage multimodal language models to enrich the observations available to RL agents and improve decision-making?
by Maryam Hashemi
0 votes
Long-horizon planning using LLMs has been more successful than direct RL policy training, due to the slow autoregressive, interactive nature of LLMs and the slow process of transforming observations into text/prompts. Can VLMs help us speed this up?
by Elahe Aghapour
0 votes
LLMs/VLMs are trained on internet data and are mostly used as high-level planners or reward functions. Is there potential for VLMs/LLMs to serve as foundation models for RL in the future? How can we connect world-model knowledge to a diverse set of actions for different tasks? (meta-mult…
by Elahe Aghapour
0 votes
In RL, learning from scratch results in data inefficiency. How can we enhance efficiency by incorporating knowledge from LLMs as an initial policy, the way humans do, instead of starting from scratch?
by Elahe Aghapour
0 votes
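One simple way this could look in code: bias early action selection toward a prior suggested by an LLM, and anneal the prior away as the agent gathers its own experience. `llm_action_prior` is a hypothetical stub standing in for a real LLM call; the mixture scheme is one option among many (behaviour cloning from LLM rollouts or KL regularization toward the prior are others).

```python
import numpy as np

ACTIONS = ["pick", "place", "wait"]

def llm_action_prior(task: str) -> np.ndarray:
    # Hypothetical LLM call, stubbed: for a "stack" task, prefer "pick".
    return np.array([0.7, 0.2, 0.1]) if "stack" in task else np.full(3, 1 / 3)

def act_probs(q_values: np.ndarray, task: str, step: int, anneal: int = 100) -> np.ndarray:
    """Mixture of softmax(Q) and the LLM prior; the prior's weight
    decays linearly to zero over `anneal` steps."""
    w = max(0.0, 1.0 - step / anneal)
    exp_q = np.exp(q_values - q_values.max())
    softmax_q = exp_q / exp_q.sum()
    return w * llm_action_prior(task) + (1 - w) * softmax_q

p_start = act_probs(np.zeros(3), "stack blocks", step=0)    # prior dominates
p_end = act_probs(np.zeros(3), "stack blocks", step=100)    # pure softmax(Q)
```

Early on the agent explores where the LLM's world knowledge says progress is likely; by the end it relies only on its own learned Q-values.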
What is required to learn a world model? Internet text data? Multimodal data? Embodiment/interaction data? All of the above? Something else?
0 votes
LLMs have been shown to perform well at decomposing high-level goals into sub-tasks. Can we leverage this capability to create hierarchies or options for RL agents?
0 votes
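A minimal sketch of the decomposition-to-options pipeline: the LLM turns a goal into a list of sub-tasks, and each sub-task becomes an option (a temporally extended action with a termination predicate). The decomposition call is stubbed with a canned response; a real system would prompt an actual LLM and parse its output, and all names here are hypothetical.

```python
def llm_decompose(goal: str) -> list:
    # Hypothetical LLM call, stubbed with a fixed plan for one goal.
    canned = {
        "make coffee": ["find mug", "fill water", "brew", "pour"],
    }
    return canned.get(goal, [goal])  # fall back to the goal itself

class Option:
    """Minimal option: a named sub-policy with a termination predicate."""
    def __init__(self, name: str):
        self.name = name

    def terminated(self, info: dict) -> bool:
        # In a real agent this would be a learned or scripted predicate.
        return info.get("done_subtask") == self.name

def build_options(goal: str) -> list:
    return [Option(sub) for sub in llm_decompose(goal)]

options = build_options("make coffee")
print([o.name for o in options])  # -> ['find mug', 'fill water', 'brew', 'pour']
```

The RL problem then shrinks to choosing among a handful of LLM-proposed options rather than exploring primitive actions over the whole horizon.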
Can we leverage LLMs to provide explainability for RL approaches?
0 votes
How might LLMs help RL? How might RL help LLMs? What might be the best way to integrate RL and LLMs?
0 votes
Is prompting, perhaps together with mechanisms like ReAct and/or RAG, enough for building LLM-powered AI agents? Will fine-tuning or even pre-training be necessary?
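For reference, the prompting-only side of this question can be made concrete with a minimal ReAct-style loop: the LLM alternates Thought/Action steps, tool observations are appended to the transcript, and the loop stops at a final answer. The LLM is stubbed here with scripted replies; a real agent would call a model API and parse its output the same way. All function names are hypothetical.

```python
def llm(transcript: str) -> str:
    # Hypothetical LLM, scripted for a single one-tool lookup task.
    if "Observation:" not in transcript:
        return "Thought: I need the capital.\nAction: lookup[France]"
    return "Thought: I have it.\nAction: finish[Paris]"

def tool_lookup(arg: str) -> str:
    return {"France": "Paris"}.get(arg, "unknown")

def react(question: str, max_steps: int = 5) -> str:
    """Minimal ReAct loop: prompt, parse the Action, run the tool,
    append the Observation, repeat until finish[...] or step limit."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = llm(transcript)
        action = reply.rsplit("Action: ", 1)[1]
        name, arg = action.rstrip("]").split("[", 1)
        if name == "finish":
            return arg
        transcript += f"\n{reply}\nObservation: {tool_lookup(arg)}"
    return "no answer"

print(react("What is the capital of France?"))  # -> Paris
```

Whether this pattern alone suffices, or whether the parsing failures and planning limits of frozen models push toward fine-tuning, is exactly the open question above.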