Questions to RL panel @ RL+LLM workshop @ AAAI 2024
We have seen in robot learning that providing a single demonstration before training allows an agent to converge quickly to near-ideal behaviour, even if the demonstration is sub-optimal. Could a similar idea be applied with LLMs, where a user provides textual context/demonstrations to avoid a cold start in training…
by Dan UoY
1 vote (Dan)
Current approaches have focused on using LLMs to define reward functions or to evaluate rewards directly. Can we use LLMs as world models in model-based RL approaches? We know that we need a step time of less than a microsecond for on-policy algorithms; how can we achieve that?
by Salar Rahili
1 vote (Amir Mohammad Dorpoush)
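To make the "LLM as world model" idea concrete, here is a minimal sketch of model-based lookahead where the transition model is an LLM queried in text. Everything here is hypothetical: `query_llm` stands in for a real model call and is stubbed with a toy deterministic transition so the loop runs.

```python
# Sketch: an LLM as the world model in model-based RL (all names hypothetical).

def query_llm(prompt: str) -> str:
    # Hypothetical LLM call. Stubbed: parse "pos=P act=A" and move the agent.
    parts = dict(tok.split("=") for tok in prompt.split())
    pos, act = int(parts["pos"]), parts["act"]
    return str(pos + (1 if act == "right" else -1))

def llm_world_model(state: int, action: str) -> int:
    """Predict the next state by prompting the (stubbed) LLM."""
    return int(query_llm(f"pos={state} act={action}"))

def plan(state: int, horizon: int = 3) -> str:
    """Pick the first action whose imagined rollout under the LLM world model
    ends closest to a fixed goal position."""
    goal = 5
    best_action, best_dist = None, float("inf")
    for first in ("left", "right"):
        s = llm_world_model(state, first)
        for _ in range(horizon - 1):  # greedy imagined rollout
            s = llm_world_model(s, "right" if s < goal else "left")
        if abs(s - goal) < best_dist:
            best_action, best_dist = first, abs(s - goal)
    return best_action

print(plan(0))  # -> right
```

The latency concern in the question shows up here directly: each imagined step is a full autoregressive LLM call, which is why sub-microsecond step times look out of reach without distillation or caching.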
How can we guide RL agents during training? Can we use a policy conditioned on text from the user to help the agent overcome a bottleneck behaviour? (Conditioning the reward function is not the best option, since it isn't transferable at inference.)
by Salar Rahili
1 vote (Dan)
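A minimal sketch of what a text-conditioned policy could look like: the user's guidance is embedded and concatenated with the state features before the policy head, so the same guidance channel is available at inference. The embedding function here is a toy deterministic hashing stub standing in for a real text encoder, and the network weights are random; all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed_text(text: str, dim: int = 8) -> np.ndarray:
    # Toy deterministic "embedding": bucket each token by character sum.
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[sum(map(ord, tok)) % dim] += 1.0
    return vec / max(1.0, np.linalg.norm(vec))

W = rng.normal(size=(8 + 4, 3))  # (text_dim + state_dim) -> 3 actions

def policy(state: np.ndarray, guidance: str) -> int:
    """Return the argmax action given state features and textual guidance."""
    x = np.concatenate([embed_text(guidance), state])
    logits = x @ W
    return int(np.argmax(logits))

state = np.array([0.1, -0.3, 0.5, 0.0])
action = policy(state, "go through the door on the left")
```

Because the conditioning lives in the policy input rather than the reward, the same mechanism transfers to inference: swap the guidance string and the behaviour changes without retraining.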
What are the potential threats of using RL and LLMs combined? How can we avoid them?
by Ali Vaziri
1 vote (Ali Vaziri)
How can researchers combine RL and LLMs with visual representations? There are very few resources available on RL and LLMs. How can we provide easy, clear, and meaningful resources about RL and LLMs, such as the math behind RL, LLMs, etc.?
by Mahammad Humayoo
1 vote (Mahammad Humayoo)
Thoughts on using LLMs as a powerful representation for RL to tackle problems such as non-Markovian domains and zero-shot learning?
1 vote (Dan)
Will LLMs help build AI agents? How? Related: can current LLMs plan and reason? Why or why not? Are there any successful LLM-powered AI agents?
1 vote (Mahammad Humayoo)
How can we leverage multimodal language models to enrich the observations available to RL agents and improve decision-making?
by Maryam Hashemi
0 votes
Long-horizon planning using LLMs has been more successful than direct RL policy training, due to the slow autoregressive, interactive nature of LLMs and the slow process of transforming observations into text/prompts. Can VLMs help us speed this up?
by Elahe Aghapour
0 votes
LLMs/VLMs are trained on internet data and are mostly used as high-level planners or reward functions. Is there potential for VLMs/LLMs to serve as foundation models for RL in the future? How can we connect world-model knowledge to a diverse set of actions for different tasks? (meta-mult…
by Elahe Aghapour
0 votes
In RL, learning from scratch results in data inefficiency. How can we enhance efficiency by incorporating knowledge from LLMs as an initial policy, the way humans do, instead of starting from scratch?
by Elahe Aghapour
0 votes
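One simple way this could look in code: bias early action selection toward a prior suggested by an LLM, and anneal the prior away as the agent gathers its own experience. `llm_action_prior` is a hypothetical stub standing in for a real LLM call; the mixture scheme is one option among many (behaviour cloning from LLM rollouts or KL regularization toward the prior are others).

```python
import numpy as np

ACTIONS = ["pick", "place", "wait"]

def llm_action_prior(task: str) -> np.ndarray:
    # Hypothetical LLM call, stubbed: for a "stack" task, prefer "pick".
    return np.array([0.7, 0.2, 0.1]) if "stack" in task else np.full(3, 1 / 3)

def act_probs(q_values: np.ndarray, task: str, step: int, anneal: int = 100) -> np.ndarray:
    """Mixture of softmax(Q) and the LLM prior; the prior's weight
    decays linearly to zero over `anneal` steps."""
    w = max(0.0, 1.0 - step / anneal)
    exp_q = np.exp(q_values - q_values.max())
    softmax_q = exp_q / exp_q.sum()
    return w * llm_action_prior(task) + (1 - w) * softmax_q

p_start = act_probs(np.zeros(3), "stack blocks", step=0)    # prior dominates
p_end = act_probs(np.zeros(3), "stack blocks", step=100)    # pure softmax(Q)
```

Early on the agent explores where the LLM's world knowledge says progress is likely; by the end it relies only on its own learned Q-values.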
What is required to learn a world model? Internet text data? Multimodal data? Embodiment/interaction data? All of the above? Something else?
0 votes
LLMs have been shown to perform well at decomposing high-level goals into sub-tasks. Can we leverage this capability to create hierarchies or options for RL agents?
0 votes
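A minimal sketch of the decomposition-to-options pipeline: the LLM turns a goal into a list of sub-tasks, and each sub-task becomes an option (a temporally extended action with a termination predicate). The decomposition call is stubbed with a canned response; a real system would prompt an actual LLM and parse its output, and all names here are hypothetical.

```python
def llm_decompose(goal: str) -> list:
    # Hypothetical LLM call, stubbed with a fixed plan for one goal.
    canned = {
        "make coffee": ["find mug", "fill water", "brew", "pour"],
    }
    return canned.get(goal, [goal])  # fall back to the goal itself

class Option:
    """Minimal option: a named sub-policy with a termination predicate."""
    def __init__(self, name: str):
        self.name = name

    def terminated(self, info: dict) -> bool:
        # In a real agent this would be a learned or scripted predicate.
        return info.get("done_subtask") == self.name

def build_options(goal: str) -> list:
    return [Option(sub) for sub in llm_decompose(goal)]

options = build_options("make coffee")
print([o.name for o in options])  # -> ['find mug', 'fill water', 'brew', 'pour']
```

The RL problem then shrinks to choosing among a handful of LLM-proposed options rather than exploring primitive actions over the whole horizon.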
Can we leverage LLMs to provide explainability for RL approaches?
0 votes
How might LLMs help RL? How might RL help LLMs? What might be the best way to integrate RL and LLMs?
0 votes
Is prompting, perhaps together with mechanisms like ReAct and/or RAG, enough for building LLM-powered AI agents? Will fine-tuning or even pre-training be necessary?
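For reference, the prompting-only side of this question can be made concrete with a minimal ReAct-style loop: the LLM alternates Thought/Action steps, tool observations are appended to the transcript, and the loop stops at a final answer. The LLM is stubbed here with scripted replies; a real agent would call a model API and parse its output the same way. All function names are hypothetical.

```python
def llm(transcript: str) -> str:
    # Hypothetical LLM, scripted for a single one-tool lookup task.
    if "Observation:" not in transcript:
        return "Thought: I need the capital.\nAction: lookup[France]"
    return "Thought: I have it.\nAction: finish[Paris]"

def tool_lookup(arg: str) -> str:
    return {"France": "Paris"}.get(arg, "unknown")

def react(question: str, max_steps: int = 5) -> str:
    """Minimal ReAct loop: prompt, parse the Action, run the tool,
    append the Observation, repeat until finish[...] or step limit."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = llm(transcript)
        action = reply.rsplit("Action: ", 1)[1]
        name, arg = action.rstrip("]").split("[", 1)
        if name == "finish":
            return arg
        transcript += f"\n{reply}\nObservation: {tool_lookup(arg)}"
    return "no answer"

print(react("What is the capital of France?"))  # -> Paris
```

Whether this pattern alone suffices, or whether the parsing failures and planning limits of frozen models push toward fine-tuning, is exactly the open question above.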