Open Source Adoption as an Indicator of AI Readiness

By Rafael Pereira
February 12, 2026

Author's note

I have watched the same pattern repeat across regulated analytics teams. The organizations that adopt open source well tend to move faster with AI, and they do it with more control. That does not happen by luck: the infrastructure, practices, and cultural shifts required for open-source adoption (version control, reproducible environments, automated pipelines, collaborative development) are the building blocks on which successful modern engineering and AI implementations rest.

In this article, I share why I believe open source maturity works as a practical signal of AI readiness, especially in pharma. I draw on industry research and examples from organizations that moved from cautious adoption to repeatable, compliant delivery. If you want an honest starting point for your AI program, start by looking at how you adopt open source tooling today.

Note that the definitions can vary significantly from org to org, or team to team, based on context. In this article:

“AI Readiness” means your ability to move from AI pilots to repeatable, controlled use in production: you can ship changes, reproduce results, explain decisions, and pass internal quality review.

“Open Source Maturity” reflects your organization's capability to adopt R, Python (or similar languages) and their expansive, community-driven package ecosystems to achieve analytical objectives such as statistical analysis, clinical outputs, and regulatory submissions, while maintaining compliance and operational continuity.

Introduction

The conversation around AI readiness typically centers on data quality, talent acquisition, and executive buy-in. While these factors matter, they overlook a more fundamental question: does the organization have the technical and cultural infrastructure to operationalize AI at scale? 

Open source maturity works as a strong indicator of AI readiness because it forces an organization to build the same operational capabilities that AI depends on. If your teams can onboard external packages in a regulated setting, you already know how to control inputs, prove outputs, and manage change. AI adds complexity, but it rarely adds a new category of operational need.

According to the Linux Foundation's research on the economic impacts of open-source AI, 89% of organizations that have adopted AI use open-source AI tools and models in some form within their infrastructure. The State of Global Open Source 2025 report found that 83% of enterprises consider open-source adoption valuable to their future, with 82% viewing it as an asset that enables innovation. Meanwhile, enterprise AI adoption has reached 88% according to McKinsey, with firms using open-source AI reporting 35% reduction in total cost of ownership compared to proprietary solutions. To me, this suggests that open-source adoption and AI readiness share common prerequisites.

The shared foundation

Open-source adoption and AI implementation both require organizations to build capabilities that legacy, proprietary environments often lack. When an organization successfully transitions to open-source tools, it necessarily develops these four foundational capabilities:

Version Control and Reproducibility

Open-source workflows are built on version control systems like Git. Code, configurations, and increasingly data pipelines are tracked, versioned, and auditable. This same infrastructure is essential for AI: model versioning, experiment tracking, and reproducible training runs all depend on version control maturity. Organizations that have adopted Git-based workflows for their analytical code have already cleared a significant hurdle for AI operationalization.
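To make this concrete, here is a minimal Python sketch of the habit this maturity buys you: every analysis run is tied to a code version (in practice, a Git commit hash), its parameters, and a hash of its input data, so the run can be reproduced or audited later. The `run_manifest` helper and its fields are illustrative, not any specific tool's API:

```python
import hashlib
import json

def run_manifest(code_version: str, params: dict, data: bytes) -> dict:
    """Build a manifest tying an analysis run to its exact inputs.

    code_version would typically be the Git commit hash of the
    analysis code; here it is just a string argument.
    """
    manifest = {
        "code_version": code_version,
        "params": params,
        "data_sha256": hashlib.sha256(data).hexdigest(),
    }
    # A canonical JSON serialization makes the manifest itself
    # hashable, so two runs can be compared with one fingerprint.
    canonical = json.dumps(manifest, sort_keys=True)
    manifest["fingerprint"] = hashlib.sha256(canonical.encode()).hexdigest()
    return manifest

# Identical code, parameters, and data always yield the same fingerprint.
a = run_manifest("4f2c9e1", {"alpha": 0.05}, b"subject_id,visit\n001,1\n")
b = run_manifest("4f2c9e1", {"alpha": 0.05}, b"subject_id,visit\n001,1\n")
assert a["fingerprint"] == b["fingerprint"]
```

The same fingerprinting idea extends directly to AI work: swap the input data hash for a training-set hash and the parameters for hyperparameters, and you have the skeleton of experiment tracking.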

Containerization and Environment Management

Open-source tools like R and Python require careful dependency management. Organizations that have solved this challenge (through containers, virtual environments, or package management systems) have built infrastructure that directly supports AI deployment. The same container that ensures an R analysis runs identically across development and production can host an ML model serving predictions.
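One simple illustration of what "solved dependency management" looks like in practice is drift detection: comparing a frozen lockfile against what an environment actually contains. The sketch below assumes a requirements.txt-style `package==version` format; both helper names are hypothetical:

```python
def parse_lock(lock_text: str) -> dict:
    """Parse 'package==version' lines into a {name: version} dict."""
    pins = {}
    for line in lock_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        name, _, version = line.partition("==")
        pins[name] = version
    return pins

def environment_drift(lock_text: str, installed: dict) -> list:
    """Report packages whose installed version differs from the lockfile."""
    pins = parse_lock(lock_text)
    return sorted(
        f"{pkg}: locked {want}, found {installed.get(pkg, 'missing')}"
        for pkg, want in pins.items()
        if installed.get(pkg) != want
    )

lock = """
# frozen analysis environment
numpy==1.26.4
pandas==2.2.2
"""
assert environment_drift(lock, {"numpy": "1.26.4", "pandas": "2.2.2"}) == []
assert environment_drift(lock, {"numpy": "1.26.4"}) == [
    "pandas: locked 2.2.2, found missing"
]
```

Tools like renv, pip-tools, or conda lockfiles implement far more robust versions of this check, but the principle is the same one an ML deployment needs.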

Automated Pipelines and CI/CD

Mature open-source environments incorporate continuous integration and deployment. Code is automatically tested, validated, and deployed through pipelines. This automation culture is a prerequisite for ML operations (MLOps), where models must be continuously retrained, validated, and deployed. Organizations without CI/CD maturity struggle to move AI from prototype to production.
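The gate pattern at the heart of CI/CD can be sketched in a few lines: an artifact is released only when every automated check passes, and failures are reported together rather than one at a time. The check names and artifact fields below are invented for illustration:

```python
def run_pipeline(artifact: dict, checks: list) -> tuple:
    """Run every check; release is allowed only if all checks pass.

    Each check is a (name, function) pair returning True on success.
    Collecting the full failure list mirrors how a CI job surfaces
    all problems in one run instead of one per attempt.
    """
    failures = [name for name, check in checks if not check(artifact)]
    return (len(failures) == 0, failures)

checks = [
    ("has_tests", lambda a: a.get("tests_passed", False)),
    ("has_changelog", lambda a: bool(a.get("changelog"))),
    ("reviewed", lambda a: a.get("approvals", 0) >= 1),
]

ok, failures = run_pipeline(
    {"tests_passed": True, "changelog": "fix: rounding", "approvals": 2},
    checks,
)
assert ok and failures == []

ok, failures = run_pipeline({"tests_passed": True}, checks)
assert not ok and failures == ["has_changelog", "reviewed"]
```

In MLOps the check list simply grows: model performance thresholds, data-drift tests, and fairness checks slot into the same gate.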

Collaborative Development Culture

Open-source adoption shifts organizations from siloed, individual work to collaborative, transparent development. Code reviews, shared repositories, and cross-functional contributions become normal. AI projects demand similar collaboration: data scientists, engineers, domain experts, and compliance teams must work together. Organizations that have already adopted collaborative open-source workflows find this transition natural.

Common ways AI programs stall or fail

The foundations above map directly to the common ways AI programs stall or fail.

  1. Weak version control and traceability lead to models (or experiments) you cannot reproduce or explain.
  2. Weak environment control leads to results that change across machines and releases (“it worked on my machine!”).
  3. Weak automated checks lead to slow releases and a high manual review load (taking months to change a small configuration in a complex system).
  4. Weak collaboration and review lead to shadow AI and undocumented decisions (people bypass controlled mechanisms for the sake of progress).

Evidence from regulated industries

The pharmaceutical industry provides compelling evidence of the open-source-to-AI pathway. Companies like Roche, Novartis, GSK, and Novo Nordisk have invested heavily in open-source infrastructure over the past decade. Their journeys reveal a consistent pattern: organizations that built robust open-source foundations are now the ones most actively exploring AI integration.

Roche, which achieved the first end-to-end R-based regulatory submission to the FDA, EMA, and China's NMPA, maintains over 100 public repositories under its insightsengineering GitHub organization. The infrastructure built to validate R packages (automated testing, risk assessment, reproducible environments) now supports their exploration of AI-augmented validation and automated quality control.

Similarly, Novartis operates one of the most transparent open-source validation programs in the industry, with over 1,100 packages publicly documented with risk assessments. Their data shows that only 7.4% of packages experienced risk rating changes over time. This mature infrastructure positions them well for AI adoption: the same validation frameworks that govern R packages can be extended to AI models.

GSK's journey is particularly instructive. The company committed to having 50% of all code in open-source languages by the end of 2025, training over 80% of their 1,000+ biostatisticians in R. Their infrastructure includes frozen environments for production consistency and the WARP analytics platform. Companies that have made these investments have built the very platforms that AI requires.

The ROI connection

Open source maturity does not guarantee AI success. It does, however, give you a strong signal that you can scale AI responsibly. Your data quality and governance still matter, but open source maturity tells you whether you can operationalize whatever AI you build, or whatever you build with AI.

That said, there is research showing that organizations using open-source AI tools report 35% lower total cost of ownership compared to proprietary-only approaches, with some studies showing 25% higher ROI. While causation is difficult to establish, there seems to be a pattern here: open-source adopters appear to extract more value from their AI investments.

Several factors likely contribute. Open-source environments reduce the cost of experimentation, allowing organizations to iterate on AI use cases without significant licensing overhead. The flexibility of open-source tools enables customization to specific organizational needs. Perhaps most importantly, organizations with open-source experience have typically developed the internal expertise to implement and maintain AI systems rather than depending entirely on vendors.

The Linux Foundation estimates that companies would have to spend 3.5 times more if open-source software did not exist, and that as AI adoption increases, open-source models will drive even greater cost savings than traditional open-source software. For organizations planning AI investments, building open-source capabilities first may be the most cost-effective path.

Assessing AI readiness through open-source maturity

For organizations evaluating their AI readiness, examining open-source maturity provides a practical framework. Key questions may include:

Version Control Adoption: Is code stored in version control systems? Are commits attributed and timestamped? Can changes be traced and rolled back? Organizations with immature version control practices will struggle with AI experiment tracking and model versioning.

Environment Reproducibility: Can analytical environments be reliably recreated? Are dependencies managed systematically? Organizations that cannot reproduce their current analyses will face significant challenges reproducing AI model training.

Automation Maturity: Do automated pipelines exist for testing and deployment? Is manual intervention required for routine operations? AI at scale requires automated retraining, validation, and deployment. Organizations without existing automation will need to build these capabilities before AI can move beyond proof-of-concept.

Collaborative Infrastructure: Do teams share code and collaborate through formal processes? Are code reviews standard practice? AI projects require cross-functional collaboration that is difficult to achieve in siloed environments.
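These four question areas can be folded into a quick self-assessment. The sketch below is a deliberately simple scoring scheme, not a standard maturity instrument; the question set just mirrors the dimensions above:

```python
QUESTIONS = {
    "version_control": "Is code stored in version control with traceable changes?",
    "reproducibility": "Can analytical environments be reliably recreated?",
    "automation": "Do automated pipelines exist for testing and deployment?",
    "collaboration": "Are code reviews and shared repositories standard practice?",
}

def assess(answers: dict) -> dict:
    """Turn yes/no answers into a score plus the gaps to address first."""
    gaps = [area for area in QUESTIONS if not answers.get(area, False)]
    return {
        "score": f"{len(QUESTIONS) - len(gaps)}/{len(QUESTIONS)}",
        "gaps": gaps,
    }

result = assess({
    "version_control": True,
    "reproducibility": True,
    "automation": False,
    "collaboration": True,
})
assert result == {"score": "3/4", "gaps": ["automation"]}
```

The value is not the number itself but the `gaps` list: it names the one or two capabilities to build before an AI program can scale.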

But again, context matters, and every organization sets this bar differently. For example, a late-stage clinical reporting workflow may require locked environments and formal approvals, while early research can accept more change as long as you can trace it. Treat the questions above as a guideline for assessing your current state and spotting the one or two gaps that will block AI scaling.

Implications for strategy

If open-source adoption predicts AI readiness, the strategic implications are significant. Organizations planning AI initiatives should consider whether their current infrastructure can support them. Investing in AI without the underlying foundations often leads to pilots that never scale.

Conversely, organizations that have delayed AI exploration but invested in open-source modernization may be better positioned than they realize. The infrastructure that supports R and Python workflows (e.g. containerization, CI/CD, version control, collaborative development) is, more often than not, directly applicable to AI. These organizations may find that AI adoption requires incremental rather than transformational change.

For organizations still dependent on legacy proprietary systems, the path to AI runs through open-source adoption. Attempting to implement AI on top of monolithic, poorly-versioned, manually-deployed systems is likely to fail. The more pragmatic approach is to build open-source capabilities first, then leverage that foundation for AI.

Conclusion

You can treat open source maturity as a practical proxy for AI readiness. If you struggle to validate and maintain open-source packages and modern tooling with clear evidence and change control, you will struggle even more when models update more often, behave less predictably, require ongoing monitoring, and bring in a slew of unfamiliar operational tools and controls.

Working in regulated analytics, we often see a clear pattern: teams that adopt open source well also scale AI faster and more safely. Organizations that have embraced open-source tools have built the technical infrastructure, automation capabilities, and collaborative culture that AI requires. They have developed internal expertise in modern development practices. They have created environments that support experimentation and rapid iteration.

For organizations evaluating where they stand on the AI readiness spectrum, assessing open-source maturity offers a practical diagnostic. Strong open-source foundations suggest an organization is well-positioned for AI adoption. Weak foundations suggest that AI investments may be premature. Building open-source capabilities first may be the more effective path.

The organizations leading AI adoption today are not those that waited for AI and then scrambled to build infrastructure. They are the ones that invested in modern, open-source-first platforms years ago and now find themselves naturally positioned for the AI era. Faster open-source adoption leads to faster AI adoption. 

References

[1] Linux Foundation. (2024). The Economic and Workforce Impacts of Open Source AI. Available at: https://www.linuxfoundation.org/research/economic-impacts-of-open-source-ai. Key finding: "89% of organizations that have adopted AI use open-source AI tools and models in some form."

[2] Gerosa, M., & Lawson, A. (2025). The State of Global Open Source 2025. The Linux Foundation. Available at: https://canonical.com/blog/state-of-global-open-source-2025. Key finding: "83% of enterprises consider open-source software adoption valuable to their future; 82% view open source as an asset that enables innovation."

[3] McKinsey & Company. (2025). The State of AI: Global Survey. Available at: https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai. Key finding: "88% of organizations report using AI in at least one business function (up from 78% in 2024)."

[4] Market.us. (2025). Open-Source AI Model Market Size Report. Available at: https://market.us/report/open-source-ai-model-market/. Key finding: "Firms using open-source AI report a 35% reduction in total cost of ownership compared to full proprietary solutions."

[5] Deloitte. (2026). The State of AI in the Enterprise. Available at: https://www.deloitte.com/us/en/what-we-do/capabilities/applied-artificial-intelligence/content/state-of-ai-in-the-enterprise.html. Key finding: "Worker access to AI rose by 50% in 2025; 42% of companies believe their strategy is highly prepared for AI adoption."

[6] R Validation Hub (pharmaR). (2022). Case Studies: Novartis R Package Risk Assessment. Available at: https://pharmar.org/posts/case-studies/novartis-case-study/. Key finding: "Novartis defined ten risk assessment criteria; packages categorized into Low, Medium, and High risk tiers with corresponding validation requirements."

[7] Appsilon. (2024). GSK's Open-Source Shift: Training 1,000 Biostatisticians in R. Available at: https://www.appsilon.com/post/gsk-r-adoption-journey. Key finding: "GSK committed to 50% of all code in open-source languages by end of 2025; over 80% of 1,000+ biostatisticians trained in R."

[8] Roche insightsengineering. (2022-2025). GitHub organization with 100+ public repositories. Available at: https://github.com/insightsengineering. Key finding: "Roche achieved the first end-to-end R-based NDA submission to the FDA, EMA, and China's NMPA in 2022."
