r/algotrading 1d ago

Data Roast My Stock Screener: Python + AI Analysis (Open Source)

Hi r/algotrading — I've developed an open-source stock screener that integrates traditional financial metrics with AI-generated analysis and news sentiment. It's still in its early stages, and I'm sharing it here to seek honest feedback from individuals who've built or used sophisticated trading systems.

GitHub: https://github.com/ba1int/stock_screener

What It Does

  • Screens stocks using reliable Yahoo Finance data.
  • Analyzes recent news sentiment using NewsAPI.
  • Generates summary reports using OpenAI's GPT model.
  • Outputs structured reports containing metrics, technicals, and risk.
  • Employs a modular architecture, allowing each component to run independently.

Sample Output

{
  "AAPL": {
    "score": 8.0,
    "metrics": {
      "market_cap": "2.85T",
      "pe_ratio": 27.45,
      "volume": 78521400,
      "relative_volume": 1.2,
      "beta": 1.21
    },
    "technical_indicators": {
      "rsi_14": 65.2,
      "macd": "bullish",
      "ma_50_200": "above"
    }
  },
  "OCGN": {
    "score": 9.0,
    "metrics": {
      "market_cap": "245.2M",
      "pe_ratio": null,
      "volume": 1245600,
      "relative_volume": 2.4,
      "beta": 2.85
    },
    "technical_indicators": {
      "rsi_14": 72.1,
      "macd": "neutral",
      "ma_50_200": "crossing"
    }
  }
}
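
Because the output is plain JSON, other tools can consume it directly. Here is a minimal sketch, assuming the screener output is available as a JSON string. The `rank_by_score` helper is hypothetical and not part of the repo. Note that `pe_ratio` can be `null`, as in the OCGN entry, so consumers should not assume it is numeric.

```python
import json

# Trimmed copy of the sample output above; JSON null becomes None in Python.
SAMPLE = """
{
  "AAPL": {"score": 8.0, "metrics": {"pe_ratio": 27.45, "relative_volume": 1.2}},
  "OCGN": {"score": 9.0, "metrics": {"pe_ratio": null, "relative_volume": 2.4}}
}
"""

def rank_by_score(report):
    """Return tickers sorted from highest to lowest score (hypothetical helper)."""
    return sorted(report, key=lambda ticker: report[ticker]["score"], reverse=True)

report = json.loads(SAMPLE)
ranked = rank_by_score(report)

# Collect tickers whose P/E is missing so later steps can handle them explicitly.
missing_pe = [t for t in ranked if report[t]["metrics"]["pe_ratio"] is None]
print(ranked, missing_pe)
```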

Example GPT-Generated Report

## AAPL Analysis Report - 2025-04-05

- Quantitative Score: 8.0/10
- News Sentiment: Positive (0.82)
- Trading Volume: Above 20-day average (+20%)

### Summary:

Institutional buying pressure is detected, bullish options activity is observed, and price action suggests potential accumulation. Resistance levels are $182.5 and $185.2, while support levels are $178.3 and $176.8.

### Risk Metrics:

- Beta: 1.21
- 20-day volatility: 18.5%
- Implied volatility: 22.3%

---

Current Screening Criteria:

  • Volume > 100k
  • Market capitalization filters (excluding microcaps)
  • Relative volume thresholds
  • Basic technical indicators (RSI, MACD, MA crossover)
  • News sentiment score (optional)
  • Volatility range filters
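
The first three filters in the list above can be sketched as a single predicate. This is an illustration, not the repo's code: only the 100k volume floor comes from the post, while the market-cap and relative-volume cutoffs, the `passes_screen` name, and the `TINY` ticker are assumptions made up for the example.

```python
def passes_screen(stock, min_volume=100_000, min_market_cap=100e6, min_rvol=1.0):
    """Apply volume, market-cap, and relative-volume filters to one stock.

    min_volume matches the post; min_market_cap and min_rvol are assumed values.
    """
    return (
        stock["volume"] > min_volume
        and stock["market_cap"] >= min_market_cap
        and stock["relative_volume"] >= min_rvol
    )

candidates = [
    # Values taken from the sample output in the post.
    {"ticker": "AAPL", "volume": 78_521_400, "market_cap": 2.85e12, "relative_volume": 1.2},
    {"ticker": "OCGN", "volume": 1_245_600, "market_cap": 245.2e6, "relative_volume": 2.4},
    # Made-up entry to show a rejection; not a real quote.
    {"ticker": "TINY", "volume": 50_000, "market_cap": 40e6, "relative_volume": 3.0},
]

selected = [s["ticker"] for s in candidates if passes_screen(s)]
print(selected)
```

Keeping the thresholds as keyword arguments makes it easy to test how sensitive the candidate list is to each cutoff.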

How to Run It:

git clone https://github.com/ba1int/stock_screener.git
cd stock_screener
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt

Add your API keys to a .env file:

OPENAI_API_KEY=your_key
NEWS_API_KEY=your_key
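
Before running anything, it can help to check that both keys are actually set. Here is a small standard-library sketch, assuming the .env file uses simple KEY=value lines. The repo may load the file differently (for example with a dotenv library), and `parse_env` and `missing_keys` are hypothetical helpers.

```python
REQUIRED_KEYS = ("OPENAI_API_KEY", "NEWS_API_KEY")

def parse_env(text):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values

def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]

# Example: the second key was left empty, so it is reported as missing.
example = "OPENAI_API_KEY=your_key\nNEWS_API_KEY=\n"
print(missing_keys(parse_env(example)))
```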

Then run:

python run_specific_component.py --screen     # Run the stock screener
python run_specific_component.py --news       # Fetch and analyze news
python run_specific_component.py --analyze    # Generate AI-based reports

Tech Stack:

  • Python 3.8+
  • Yahoo Finance API (yfinance)
  • NewsAPI
  • OpenAI (for GPT summaries)
  • pandas, numpy
  • pytest (for unit testing)

Feedback Areas:

I'm particularly interested in critiques or suggestions on the following:

  1. Screening indicators: What are the missing components?
  2. Scoring methodology: Is it overly simplistic?
  3. Risk modeling: How can we make this more robust?
  4. Use of GPT: Is it helpful or unnecessary complexity?
  5. Data sources: Are there any better alternatives to the data I'm currently using?

4

u/E-raticSamurai 1d ago

Thanks for sharing, I know it takes a good amount of effort to do so. I am in the early stages of something similar. I recently pulled back on so much complexity and am working to simplify and can really appreciate what you have accomplished.

For sentiment, have you received valuable output? I have dabbled but am not confident in my output yet.

For options, are you pulling from the yfinance API?

This could be a silly question, but are the ‘risk’ and ‘volatility’ figures measuring the stock or the options?

RVOL: which time frame are you using? I am using 15-minute.

2

u/szotyimotyi 22h ago

Really appreciate the kind words — and totally relate to pulling back on complexity. I’ve found it’s easy to over-engineer in the early stages, so simplifying has definitely helped me stay focused.

For sentiment, the output is hit or miss depending on the stock. Larger caps with consistent news coverage tend to yield more reliable sentiment signals. I’m still experimenting with weighting it properly in the scoring model.

For options data, I’m currently not pulling from the yfinance options endpoint directly — still figuring out the best way to incorporate that without overwhelming the analysis. Definitely something on the roadmap though.

As for risk and volatility — good question. Right now it’s based on the underlying stock, not the options. Mostly using historical 20-day volatility and implied volatility from whatever’s available via yfinance.
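
For reference, one common way to compute a 20-day historical volatility figure is the standard deviation of daily log returns, annualized with the square root of 252 trading days. This sketch shows that convention only; it is an assumption that the screener does something similar, and `historical_volatility` is a hypothetical name.

```python
import math
import statistics

def historical_volatility(closes, window=20, trading_days=252):
    """Annualized volatility from the last `window` daily log returns.

    Assumes `closes` is a list of daily closing prices, oldest first.
    """
    if len(closes) < window + 1:
        raise ValueError("need at least window + 1 closing prices")
    recent = closes[-(window + 1):]
    log_returns = [math.log(b / a) for a, b in zip(recent, recent[1:])]
    # Sample standard deviation of daily returns, scaled to a yearly figure.
    return statistics.stdev(log_returns) * math.sqrt(trading_days)
```

A price series that grows by the same percentage every day has identical log returns, so this function returns a value very close to zero for it.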

For RVOL, I’m using a daily timeframe at the moment (current volume vs 20-day avg volume), but a 15-minute RVOL sounds really interesting for intraday signals. Might try that out — thanks for the idea!
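
The daily RVOL described above (current volume vs. 20-day average volume) can be sketched as follows. Whether the average should include the current day is a design choice; this version excludes it, which is an assumption rather than something the repo confirms, and `relative_volume` is a hypothetical name.

```python
def relative_volume(volumes, window=20):
    """Latest volume divided by the mean of the previous `window` volumes.

    Assumes `volumes` is ordered oldest first, with the current bar last.
    """
    if len(volumes) < window + 1:
        raise ValueError("need at least window + 1 volume observations")
    *history, current = volumes[-(window + 1):]
    average = sum(history) / window
    return current / average

# Twenty days at 1M shares followed by a 1.2M day gives an RVOL of about 1.2.
print(relative_volume([1_000_000] * 20 + [1_200_000]))
```

Passing 15-minute bar volumes instead of daily volumes would give an intraday variant of the same ratio.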

4

u/vanveekay 23h ago

Models should produce actionable output, such as the optimal positions for forming a portfolio.

Only then can you run a backtest.

I don’t see any value addition in the above.

1

u/szotyimotyi 22h ago

Thanks for the feedback — really appreciate you taking the time. It’s still a very early-stage project, so this kind of input is super valuable as I continue building it out.

2

u/stoic_trader 14h ago

This sounds like a very interesting project. Coincidentally, I am working on a similar project, but it is a lower priority as I am a short-timeframe quant trader.

Here is my pipeline, which you may find intriguing: Instead of using ChatGPT, I am utilizing LLMs locally through Ollama, such as Llama 3.2 3B, which excels at summarization and requires fewer resources. You can do sentiment analysis as well.
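
For anyone who wants to try the local route, here is a rough sketch of a summarization call against Ollama's local HTTP API using only the standard library. It assumes Ollama is running on its default port with the `llama3.2:3b` model already pulled; the prompt wording and function names are made up for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_summary_request(text, model="llama3.2:3b"):
    """Build (but do not send) a non-streaming generate request."""
    payload = {
        "model": model,
        "prompt": f"Summarize the following news in two sentences:\n\n{text}",
        "stream": False,  # ask for one JSON object instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def summarize(text, model="llama3.2:3b"):
    """Send the request; requires a running Ollama server with the model pulled."""
    with urllib.request.urlopen(build_summary_request(text, model)) as resp:
        return json.loads(resp.read())["response"]
```

Splitting request construction from sending keeps the payload easy to inspect without a server running.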

The front end is open-webui, which has robust RAG capabilities. You can store all your data in a local folder, integrate it with open-webui RAG, link Ollama through open-webui, and interactively ask questions. You can also store a file with your portfolio in the same folder and inquire about optimization etc.

Suggestion: Instead of news, you can pull options data and get a market-data-driven sentiment analysis by calculating IVs, PE & CE unwinding, etc. You can also use it for risk management via IV Percentile. IVs are still not that dependable, but they are not as subjective as running sentiment analysis on news events.
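
As a rough sketch of the IV Percentile idea: it is usually defined as the share of past observations in a lookback window (often about one year of daily values) where implied volatility was below the current reading. The function name and the sample numbers below are illustrative assumptions, not real market data.

```python
def iv_percentile(iv_history, current_iv):
    """Percentage of past observations with IV strictly below `current_iv`."""
    if not iv_history:
        raise ValueError("iv_history must not be empty")
    below = sum(1 for iv in iv_history if iv < current_iv)
    return 100.0 * below / len(iv_history)

# Illustrative values only: three of the five past readings are below 0.223.
history = [0.18, 0.20, 0.22, 0.25, 0.30]
print(iv_percentile(history, 0.223))
```

This is different from IV Rank, which places the current IV between the lookback minimum and maximum instead of counting observations.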

2

u/szotyimotyi 11h ago

That’s super interesting. I’ve heard good things about Ollama but haven’t tried it yet.

Love the idea of using options data for sentiment and risk modeling. IV Percentile and tracking PE/CE unwinding make a lot of sense, especially since they’re grounded in market data rather than subjective news flow. That kind of signal feels a lot more actionable for shorter time frames too.

Thanks again for the thoughtful suggestions!

4

u/6jSByqJv 1d ago

Looks like lazy AI sludge.

29

u/jawanda 1d ago

He's doing more than just throwing data at AI, and he even asks the question:

Use of GPT: Is it helpful or unnecessary complexity?

Don't get discouraged op. While this program may or may not ever produce good results for you, it's obvious you've put some effort into it and I say you keep building and seeing how its summaries hold up against actual market performance.

6

u/szotyimotyi 1d ago

Thanks! Yes, I use AI heavily, but I don't think that's necessarily a bad thing, since it's trained on a lot more data than I am. It's very useful in the morning to have a baseline for starting the daily research into stocks that might be worth a look. Even if it only saves 30 minutes of tedious research, I'm better off, so that's where I'm coming from. Of course I would like to have a system that tells me for sure when something is going to go up, but that's not very realistic.

1

u/PermanentLiminality 20h ago

Looks interesting. I'll have to give it a shot

I like OpenRouter because you can try different models so easily and a lot of them are cheaper. You can keep using the OpenAI Python module.

Only matters if you are going to slam it.
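
A minimal sketch of what that switch might look like: the OpenAI Python client accepts a `base_url`, so pointing it at OpenRouter's OpenAI-compatible endpoint is mostly a configuration change. The `OPENROUTER_API_KEY` variable name and the `client_kwargs` helper are assumptions for illustration.

```python
import os

OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"

def client_kwargs(use_openrouter, env=os.environ):
    """Keyword arguments for openai.OpenAI(...), for either provider.

    OPENROUTER_API_KEY is an assumed variable name; OPENAI_API_KEY is from the post.
    """
    if use_openrouter:
        return {"base_url": OPENROUTER_BASE_URL, "api_key": env["OPENROUTER_API_KEY"]}
    return {"api_key": env["OPENAI_API_KEY"]}

# Usage (requires the openai package, version 1.x, and a valid key):
#   from openai import OpenAI
#   client = OpenAI(**client_kwargs(use_openrouter=True))
#   client.chat.completions.create(model="<provider/model-name>", messages=[...])
```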

1

u/Chemical_Winner5237 14h ago

Where do you get your news from?

1

u/soman_yadav 14h ago

I’ve been exploring ways GPTs (ChatGPT, Claude, etc.) can plug into the research/coding/backtesting loop, especially in Python-heavy workflows.
Are you using LLMs in your algo trading stack yet — for strategy generation, code troubleshooting, or live pipeline assistance?

Also curious: if you’ve tested GPTs for modeling or signal generation, what were your takeaways?

Sent you a DM too

1

u/Antoni_Nabzdyk 13h ago

I do think that AI is in its growth stage, and I see room for AI-powered analysis.

0

u/kotkaani 11h ago

Thanks man.. you have excelled the realm of Gods.