Predicting Founder Success Without an LLM:
An Interpretable Tree-Based Approach to VCBench

Maheni Soumah; Jessica Mbounkap; Habiba Djigo

doi:10.33774/coe-2026-36z59

Computer Science

23 May 2026, Version 1

Working Paper

This content is an early or alternative research output and has not been peer-reviewed by Cambridge University Press at the time of posting.

Abstract

Predicting the success of startup founders from their professional profiles is a high-stakes task that has recently become tractable thanks to the public release of VCBench [Chen et al., 2025], a benchmark of 9,000 anonymised founder profiles. The current public leaderboard is dominated by large language models, which deliver state-of-the-art F0.5 scores but are expensive to run, computationally heavy, and hard to interpret. We ask whether a fully interpretable, freely reproducible tabular approach can close most of this gap. Starting from the structured JSON fields of the public VCBench split (4,500 founders, 9% positive rate), we engineer 42 features grouped into four tiers (prior exits, education, career, and industry) and benchmark four classical models (Logistic Regression, Random Forest, XGBoost, and LightGBM) under the same 6-fold cross-validation protocol as the original paper, with out-of-fold threshold tuning to optimise F0.5. Our best model, a Random Forest, reaches F0.5 = 0.246 (precision 25.1%, recall 23.5%) on the public split, on par with the structured-ML baselines on the leaderboard, and roughly 13× better than the real-world precision of the average venture-capital fund. SHAP analysis reveals that the predictions are driven by interpretable founder characteristics: alma mater prestige (QS world ranking), prior exits, exposure to large organisations, and industry alignment. Our approach incurs no API fees, runs end-to-end on a laptop in under two minutes, and is fully auditable. Current LLM-based competitors lack these three properties, all of which matter for regulated decision-making in venture capital.
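The F0.5 metric and the out-of-fold threshold tuning mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels, and the toy scores below are assumptions made purely for illustration, and the sketch assumes the out-of-fold scores have already been produced by some cross-validated model.

```python
# Illustrative sketch only: names and toy data are hypothetical,
# not taken from the paper or from VCBench.

def fbeta(precision, recall, beta=0.5):
    """F-beta score; beta < 1 weights precision more heavily than recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = successful founder)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def tune_threshold(y_true, oof_scores, beta=0.5):
    """Pick the cut-off on out-of-fold scores that maximises F-beta."""
    best_t, best_f = None, -1.0
    for t in sorted(set(oof_scores)):
        y_pred = [1 if s >= t else 0 for s in oof_scores]
        p, r = precision_recall(y_true, y_pred)
        f = fbeta(p, r, beta)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f


# Toy out-of-fold scores: each score is assumed to come from a model
# that never saw the corresponding founder during training.
y_true = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
oof_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05, 0.60, 0.70, 0.30, 0.15]

best_t, best_f = tune_threshold(y_true, oof_scores)
print(best_t, round(best_f, 4))
```

Because beta = 0.5 rewards precision over recall, the tuned threshold tends to sit higher than the recall-friendly cut-off, which suits a setting where false positives (backing founders who do not succeed) are the costlier error.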

License

The content is available under CC BY-NC-ND 4.0

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content