FrozenLake¶
Experiment Quick Start Guide
This guide helps you quickly set up and run FrozenLake experiments with ReMe integration. The FrozenLake experiment demonstrates how task memory can improve an agent’s performance in a navigation task.
Environment Setup¶
1. Clone the Repository¶
git clone https://github.com/agentscope-ai/ReMe.git
cd ReMe/cookbook/frozenlake
2. FrozenLake Environment Setup¶
Install Gymnasium for FrozenLake environment:
pip install gymnasium
This will install:
gymnasium - for the FrozenLake environment
ray - for parallel execution
openai - for LLM API access
other dependencies
3. Start ReMe Service¶
If you haven’t installed ReMe yet, follow these steps:
# Go back to the project root
cd ../..
# Create a virtual environment (optional)
conda create -p ./reme-env python==3.10
conda activate ./reme-env
# Install ReMe
pip install .
Launch the ReMe service to enable memory library functionality:
reme \
backend=http \
http.port=8002 \
llm.default.model_name=qwen-max-2025-01-25 \
embedding_model.default.model_name=text-embedding-v4 \
vector_store.default.backend=local
Add your api key for agent:
export OPENAI_API_KEY="xxx"
export OPENAI_BASE_URL="xxx"
Run Experiments¶
1. Quick Test: Performance Evaluation Only (Default)¶
Run the main experiment script to test agent performance using existing memory:
cd cookbook/frozenlake
python run_frozenlake.py
What this does:
Tests the agent on randomly generated FrozenLake maps
Uses the default memory library (
frozenlake_no_slippery)Evaluates performance with multiple runs for statistical significance
Results are automatically saved to
./exp_result/directory
2. Advanced: Training + Testing (Memory Generation)¶
To create new memories through training and then test performance:
You can modify the experiment parameters directly in the run_frozenlake.py file. The main parameters are in the main() function:
def main():
experiment_name = "frozenlake_no_slippery" # Name of the experiment
max_workers = 4 # Number of parallel workers
training_runs = 4 # Runs per training map
num_training_maps = 50 # Number of maps for training
test_runs = 1 # Runs per test configuration
num_test_maps = 100 # Number of test maps
is_slippery = False # Enable slippery mode
Key parameters to consider:
experiment_name: Used as the workspace ID for task memoryis_slippery: When True, agent movement becomes stochastic (harder)max_workers: Increase for faster execution on multi-core systems
3. View Experiment Results¶
After running experiments, analyze the statistical results:
python run_exp_statistic.py
What this script does:
Processes all result files in
./exp_result/Calculates success rates and performance metrics
Generates a summary table showing performance comparisons
Analyzes the effect of task memory on performance
Saves results to
frozenlake_summary.csv
Understanding the Implementation¶
Key Components¶
FrozenLakeReactAgent (
frozenlake_react_agent.py)Implements a ReAct agent that interacts with the FrozenLake environment
Handles task memory retrieval and storage
Uses LLM (via OpenAI API) for decision making
Experiment Runner (
run_frozenlake.py)Manages the overall experiment flow
Handles training and testing phases
Uses Ray for parallel execution
Map Manager (
map_manager.py)Generates and manages test maps
Ensures consistent evaluation across experiments
Statistics Analyzer (
run_exp_statistic.py)Processes experiment results
Calculates performance metrics
Generates comparative analysis
Output Files¶
./exp_result/*_training.jsonl: Results from training phase./exp_result/*_test_no_memory.jsonl: Test results without task memory./exp_result/*_test_with_memory.jsonl: Test results with task memory./exp_result/frozenlake_summary.csv: Statistical summary
Task Memory Mechanism¶
The task memory system works as follows:
Memory Creation: During training, successful trajectories are sent to the ReMe service
Memory Retrieval: During testing, the agent queries relevant memories based on the current map
Memory Application: The agent uses retrieved memories to guide its decision-making
The experiment demonstrates how task memory can significantly improve performance, especially in challenging environments like the slippery FrozenLake.