ExpEngine Service
The ExpEngine Service provides a RESTful API for managing experiments and workflows remotely. It lets you run, control, and monitor the execution of experiments through HTTP requests, which makes it suitable for integration with web applications, CI/CD pipelines, or remote experiment management.
Overview
The ExpEngine Service is a FastAPI-based web service that supports:
- Starting experiments asynchronously with queuing support
- Managing experiment lifecycle (status, pause, resume, kill)
- Managing workflow lifecycle (pause, resume, kill)
- Monitoring experiment queue and execution status
- Retrieving error logs for failed experiments
- Cross-origin resource sharing (CORS) support for web applications
- User-scoped workspaces for multi-tenant deployments
Key Features
Asynchronous Execution
Experiments are submitted to a queue and executed in the background, so several experiments can run concurrently, up to a configurable limit.
User Isolation
Each user can have their own workspace with isolated experiment libraries, task definitions, and dataset storage.
Queue Management
- Configurable maximum concurrent experiments (default: 4)
- Automatic queuing when capacity is reached
- Queue position tracking
- Graceful experiment cancellation
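The queuing behaviour described above can be illustrated with a minimal sketch. It is not the service's actual implementation: it assumes an asyncio semaphore is an acceptable stand-in for the queue, and the names run_experiment, submit_all, and demo are hypothetical.

```python
import asyncio

MAX_EXPERIMENTS_IN_PARALLEL = 4  # mirrors the documented default


async def run_experiment(name, slots, running, peak):
    # Submissions beyond the limit wait here until a slot is released.
    async with slots:
        running.add(name)
        peak[0] = max(peak[0], len(running))
        await asyncio.sleep(0.01)  # stand-in for real experiment work
        running.discard(name)
    return name


async def submit_all(names, limit=MAX_EXPERIMENTS_IN_PARALLEL):
    slots = asyncio.Semaphore(limit)
    running, peak = set(), [0]  # peak[0] records the highest concurrency seen
    finished = await asyncio.gather(
        *(run_experiment(n, slots, running, peak) for n in names)
    )
    return finished, peak[0]


def demo():
    # Submit ten hypothetical experiments against a limit of four.
    return asyncio.run(submit_all([f"exp{i}" for i in range(10)]))
```

Calling demo() returns the finished experiment names together with the highest number of experiments that ran at the same time, which never exceeds the limit.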
Error Logging
When an experiment fails, its error log (including the stack trace) is kept for debugging and is automatically removed after 15 minutes.
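As a rough illustration of time-limited log retention, the sketch below keeps entries in memory and drops them once they are older than 15 minutes. The ErrorLogStore class and its methods are hypothetical and only show the idea; the service may store and clean up its logs differently.

```python
import time

ERROR_LOG_TTL_SECONDS = 15 * 60  # retention period stated in this document


class ErrorLogStore:
    """Hypothetical in-memory store that forgets entries after a TTL."""

    def __init__(self, ttl=ERROR_LOG_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._logs = {}  # exp_name -> (stored_at, traceback_text)

    def put(self, exp_name, traceback_text):
        self._logs[exp_name] = (self._clock(), traceback_text)

    def get(self, exp_name):
        # Expired entries are purged lazily on every read.
        self.purge_expired()
        entry = self._logs.get(exp_name)
        return entry[1] if entry else None

    def purge_expired(self):
        now = self._clock()
        expired = [k for k, (t, _) in self._logs.items() if now - t >= self._ttl]
        for key in expired:
            del self._logs[key]
```

A client that needs a stack trace should therefore read it soon after the failure, since it is not available once the retention window has passed.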
API Documentation
Service Configuration
Prerequisites
- Python >= 3.10
- FastAPI and Uvicorn
- ExpEngine package installed (pip install eexp_engine)
- Properly configured eexp_config.py file
Configuration Requirements
In addition to the usual settings, the service requires the following to be set in eexp_config.py:
# Required: Base workspace directory for user-scoped files
WORKSPACE_ROOT = '/path/to/workspace'
# Optional: Experiment queue settings
MAX_EXPERIMENTS_IN_PARALLEL = 4 # Default: 4
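The difference between the required and the optional setting can be sketched as follows. This is an illustration only: read_service_settings is a hypothetical helper, and the real service may validate its configuration in another way.

```python
from types import SimpleNamespace

DEFAULT_MAX_PARALLEL = 4  # documented default for MAX_EXPERIMENTS_IN_PARALLEL


def read_service_settings(config):
    """Read the two settings from a config module or any object with attributes."""
    workspace_root = getattr(config, "WORKSPACE_ROOT", None)
    if not workspace_root:
        # Required setting: fail early instead of guessing a directory.
        raise ValueError("WORKSPACE_ROOT must be set in eexp_config.py")

    # Optional setting: fall back to the default when it is absent.
    max_parallel = getattr(config, "MAX_EXPERIMENTS_IN_PARALLEL", DEFAULT_MAX_PARALLEL)
    if not isinstance(max_parallel, int) or max_parallel < 1:
        raise ValueError("MAX_EXPERIMENTS_IN_PARALLEL must be a positive integer")

    return {"workspace_root": workspace_root, "max_parallel": max_parallel}


# Stand-in for an imported eexp_config module that sets only the required value.
example_config = SimpleNamespace(WORKSPACE_ROOT="/path/to/workspace")
settings = read_service_settings(example_config)
```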
WORKSPACE_ROOT Configuration
When running as a service, WORKSPACE_ROOT defines the base directory where user-specific subdirectories will be created:
WORKSPACE_ROOT/
├── user1/
│   ├── experiments/
│   ├── tasks/
│   ├── datasets/
│   └── dependencies/
└── user2/
    ├── experiments/
    ├── tasks/
    ├── datasets/
    └── dependencies/
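The layout above could be created with a helper along these lines. This is a sketch under assumptions: ensure_user_workspace is a hypothetical name, and the check that rejects user identifiers pointing outside WORKSPACE_ROOT is an added precaution, not documented behaviour of the service.

```python
from pathlib import Path

# Subdirectories shown in the layout above.
USER_SUBDIRS = ("experiments", "tasks", "datasets", "dependencies")


def ensure_user_workspace(workspace_root, user_id):
    """Create WORKSPACE_ROOT/<user_id>/ with its subdirectories if missing."""
    root = Path(workspace_root).resolve()
    user_dir = (root / user_id).resolve()

    # Accept only direct children of the root, so identifiers such as
    # "../other" or "a/b" cannot reach outside a single user directory.
    if user_dir.parent != root:
        raise ValueError(f"invalid user id: {user_id!r}")

    for sub in USER_SUBDIRS:
        (user_dir / sub).mkdir(parents=True, exist_ok=True)
    return user_dir
```

For example, ensure_user_workspace('/path/to/workspace', 'user1') would create the four subdirectories under /path/to/workspace/user1 and leave existing ones untouched.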
Starting the Service
By default, Uvicorn serves the API at http://localhost:5556.
# Direct execution
python api.py
# Or with custom host/port
python api.py --host 0.0.0.0 --port 8000
Docker Deployment
The recommended way to deploy the service is using Docker:
# Using docker-compose
docker-compose up -d
# Or using the pre-built image
docker pull ghcr.io/extremexp-horizon/exp-engine:latest
docker run -d -p 5556:5556 \
  -v "$(pwd)/eexp_config.py:/app/eexp_config.py" \
  -v "$(pwd)/workspace:/workspace" \
  ghcr.io/extremexp-horizon/exp-engine:latest
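After starting the service or the container, a script may need to wait until the port accepts connections before it sends requests. The sketch below checks only TCP reachability and makes no assumption about the service's endpoints; wait_for_port is a hypothetical helper, and the default port is the one documented above.

```python
import socket
import time


def wait_for_port(host="localhost", port=5556, timeout=30.0, interval=0.5):
    """Return True once a TCP connection succeeds, False if the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A deployment script could call wait_for_port() right after docker run and stop with an error if it returns False.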
Error Handling
The API returns standardized error responses in JSON format:
{
"error": {
"code": "ERROR_CODE",
"message": "Error description",
"exp_name": "ExperimentName",
"details": {}
}
}
Error Codes:
- NOT_FOUND - Experiment or workflow not found
- SPEC_NOT_FOUND - Experiment specification file not found in user's workspace
- INTERNAL_ERROR - Server-side error during execution
- BAD_REQUEST - Invalid request parameters
For failed experiments, the status endpoint includes error logs with stack traces for debugging (logs are retained for 15 minutes).
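On the client side, the error envelope shown above can be turned into an exception object. The following is a minimal sketch: the ExpEngineError class and the parse_error function are hypothetical names, and only the field names and error codes come from this document.

```python
import json

KNOWN_ERROR_CODES = {"NOT_FOUND", "SPEC_NOT_FOUND", "INTERNAL_ERROR", "BAD_REQUEST"}


class ExpEngineError(Exception):
    """Hypothetical client-side wrapper around the documented error envelope."""

    def __init__(self, code, message, exp_name=None, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.exp_name = exp_name
        self.details = details or {}
        # Flag codes that are not listed in this document.
        self.is_known = code in KNOWN_ERROR_CODES


def parse_error(body):
    """Return an ExpEngineError for an error response body, or None otherwise."""
    payload = json.loads(body)
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        return None
    return ExpEngineError(
        code=err.get("code", "INTERNAL_ERROR"),
        message=err.get("message", ""),
        exp_name=err.get("exp_name"),
        details=err.get("details"),
    )


sample = '{"error": {"code": "SPEC_NOT_FOUND", "message": "Spec missing", "exp_name": "Exp1", "details": {}}}'
parsed = parse_error(sample)
```

Checking the code attribute lets a caller tell a missing specification (SPEC_NOT_FOUND) apart from a server-side failure (INTERNAL_ERROR) without matching on message text.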