Round 1
Infrastructure
Concept: Independent feature and model selection with data overlaps
| Step | Data Range | Purpose |
|---|---|---|
| 1 | 30-70% | Feature selection |
| 2 | 30-70% | Initial model selection |
| 3 | 70-100% | Grid search with CV |
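
As a rough illustration of the split above, the sketch below slices a time-ordered dataset by percentage ranges. It assumes the percentages refer to positions in chronologically sorted data, which the notes do not state explicitly; `slice_by_percent` and the toy `rows` list are hypothetical names, not part of the original scripts.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def slice_by_percent(rows: Sequence[T], start_pct: int, end_pct: int) -> List[T]:
    """Return the rows between two percentage marks of an ordered sequence."""
    if not 0 <= start_pct < end_pct <= 100:
        raise ValueError("need 0 <= start_pct < end_pct <= 100")
    n = len(rows)
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    return list(rows[n * start_pct // 100 : n * end_pct // 100])


# Hypothetical time-ordered observations (oldest first).
rows = list(range(100))

# Steps 1 and 2 share the same range (the "data overlap" in the concept).
selection_rows = slice_by_percent(rows, 30, 70)
# Step 3 (grid search with CV) uses a later range that the first two steps never saw.
tuning_rows = slice_by_percent(rows, 70, 100)
```

Keeping the tuning range disjoint from the selection range means the grid search is evaluated on data that played no part in choosing features or the initial model.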
Scripts:
| File | Purpose | Details |
|---|---|---|
| backtester.py | Compare features/models | 5-fold CV, Spearman per symbol, decaying avg over 96 periods |
| fsa_feature_importance.py | Feature selection | FSA method (faster than Boruta) |
Workflow Details
Research
Takeaways
Full Takeaways
Infrastructure:
- S3 bucket for CUDA Docker caching
- GPU optimization
- Reusable scripts directory
- Multiprocess feature selection
- DigitalOcean persistent server ($4/month)
Process:
- More time for meaningful research
- Use `nohup python script.py > output.log 2>&1 &` for background calculations
Round 2
Infrastructure
Strategy Ideas
All Ideas
| Idea | Link | Result |
|---|---|---|
| ML Research | Notes | Poor results, abandoned |
| Coin Reduction | Notes | Volume-based filtering |
| Cointegration | Notes | Regime-based ECM approach |