PHUSE EU Connect 2025: Innovations in Clinical Programming – Synthesis and Insights
As clinical data science gathers momentum at PHUSE EU Connect 2025, the presentations underscore a period of rapid evolution for the sector. The discourse is both broad and deep, ranging from the modernisation of statistical computing environments (SCEs) and advances in automation to open-source adoption, AI integration and the imperative of robust change management. The following analysis provides a synthesis of key themes, novel contributions and practical lessons drawn from a diverse collection of papers and presentations.
Unifying Infrastructure and Governance in Clinical Data Science
A recurring focus at this year’s conference is the transition from fragmented, legacy infrastructure to unified, governed environments for statistical analysis. The ideal state, as articulated through both strategic roadmaps and technical deployments, is one in which teams centralise tooling and data access, embed quality and auditability into every workflow and leverage automation to generate reproducible, traceable outputs. This shift is motivated by intensifying regulatory scrutiny, an ever-growing deluge of clinical and real-world data and the perennial pressure to reduce time-to-submission.
Successful SCE modernisation has become a cross-disciplinary, cloud-enabled effort. By uniting proprietary (SAS) and open-source (R, Python) environments within a central architecture, which can be orchestrated through platforms such as Azure, SAS Viya and Posit Team, companies can dismantle silos and create true end-to-end lineage and auditability. Version control systems such as Git, coupled with automated CI/CD pipelines, underpin reproducibility and robust code promotion from development through validation to production. Shared metadata repositories and data fabrics, as exemplified by Debiopharm, serve to operationalise the FAIR (Findable, Accessible, Interoperable and Reusable) principles, ensuring that data is both governed and optimally leveraged across teams.
Tackling Change: Managing the Move to R and Open Source
One of the most striking trends is the widespread diversification of SAS-centric programming to include increasingly open, R-based ecosystems. However, as numerous case studies demonstrate, notably those from J&J, PPD and Pfizer, realising the promise of open source is not simply a question of installing new software. Rather, it demands a concerted approach to change management, stakeholder engagement and skill development.
At the heart of successful transformation is the recognition that change is fundamentally human. Effective programmes actively engage stakeholders at every level, communicate benefit with clarity and honesty, furnish robust support mechanisms (such as up-to-date training, SMEs and communities of practice), and foster a culture in which resistance is both expected and constructively addressed. The SAIL.R initiative at J&J sets a notable example, demonstrating the power of branding, grassroots communication, and patient, people-centred change management. Pfizer’s establishment of an internal R SWAT team, alongside its R Centre of Excellence, exemplifies how dedicated expertise can act as a force multiplier, stimulating innovation, reducing technical debt and acting as a conduit between best practice and business needs.
Crucially, the journey is ongoing: sustaining momentum, embedding new behaviours, and institutionalising new ways of working require long-term monitoring, regular celebration of achievements and the cultivation of an adaptive, learning mindset.
Open Source Collaboration, Validation, and Regulatory Acceptance
The maturation of cross-industry open-source ecosystems is best illustrated by the rise of initiatives like the Pharmaverse. Here, modular R packages are built to span the entire clinical trial lifecycle, from SDTM mapping to TLF automation and submission, underpinned by meticulous validation, governance and documentation protocols. Transparency and collaboration, together with a focus on reusability and traceability, are held as cardinal virtues.
Adoption is not without its challenges. There remains widespread concern about validation, compliance and the transferability of open-source pipelines into regulated submissions. However, success stories are mounting: the completion of hybrid and all-R regulatory submissions, the formulation of governance frameworks (such as the R Validation Hub), and the emergence of industry-wide standards are emboldening further progress. The cross-pollination of expertise, tools and packages across companies is, on balance, reducing duplication of effort and accelerating collective capability development. Notably, the direction of travel is not only towards more open submission pipelines but also towards genuinely hybrid, cross-language solutions that seamlessly blend SAS, R and Python, each chosen for its comparative advantage.
Automation, AI and the Imperative of Human Oversight
Pharmaceutical programming is undergoing a quiet revolution driven by automation and artificial intelligence. The application of these technologies, however, must be tempered by the realities of regulatory expectations, the primacy of data integrity and the continuing necessity of human judgement.
Papers detailing the use of LLM-powered programming assistants, code generation from mock tables (IDD’AI), automation of SDTM, ADaM and TLF workflows, and natural language interfaces for clinical queries (AI-powered chatbots) make clear that AI can be a formidable accelerator. Automation, when grounded in rich metadata and rigorous governance, not only reduces manual effort but also delivers quality, consistency and efficiency gains throughout the data lifecycle.
Yet none of these innovations is free of risk. The “Human in the Middle” paradigm, advocated by Formation Bio and others, acknowledges the need for structured oversight, explainability and auditability. Regulatory guidance is converging on the principle that human expertise must remain at the heart of critical decisions, from data ingestion to analysis interpretation, particularly in GxP contexts. AI must be explainable, its outputs validated and its actions auditable. The era of unchecked black boxes is receding in favour of a hybrid, risk-based approach that balances efficiency with trust.
Language-Agnostic Programming and Integration
With statistical programming needs expanding in both volume and complexity, including image analysis, multiomics and deep learning, organisations such as Novo Nordisk have begun to champion truly language-agnostic infrastructures. Their experience demonstrates both the promise and the pitfalls of this approach. Technical and organisational barriers abound, from data type conversion between SAS, R and Python to the need for language-agnostic storage formats such as Parquet or Dataset-JSON. Crucial to success is the standardisation of data, metadata specifications and robust validation frameworks that are shared across all languages. While these journeys are in their infancy, early successes suggest that the industry is on the cusp of building truly interoperable, multi-language analytics environments.
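To make the data type conversion problem concrete, here is a minimal Python sketch of one well-known case: SAS stores dates as a count of days since 1 January 1960, so a raw numeric value is meaningless to R or Python unless it is converted or described by metadata. The function names and the JSON layout below are hypothetical illustrations. In particular, the layout is a simplified stand-in and not the CDISC Dataset-JSON schema, and nothing here is taken from the Novo Nordisk presentation.

```python
import json
from datetime import date, timedelta

# SAS represents a date as the number of days since 1 January 1960.
SAS_EPOCH = date(1960, 1, 1)


def sas_date_to_iso(sas_days):
    """Convert a SAS date value to an ISO 8601 string (None stays None)."""
    if sas_days is None:
        return None
    return (SAS_EPOCH + timedelta(days=int(sas_days))).isoformat()


def export_rows(rows, date_columns):
    """Serialise rows to JSON with date columns converted to ISO 8601.

    The output layout is a hypothetical, simplified structure that records
    which columns hold dates so that any consuming language can parse them.
    """
    converted = [
        {
            name: sas_date_to_iso(value) if name in date_columns else value
            for name, value in row.items()
        }
        for row in rows
    ]
    payload = {"date_columns": sorted(date_columns), "rows": converted}
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    # Illustrative values only; 21915 is the SAS date value for 2020-01-01.
    example = [{"USUBJID": "SYN-0001", "RFSTDTC_NUM": 21915}]
    print(export_rows(example, {"RFSTDTC_NUM"}))
```

Storing dates as ISO 8601 text alongside explicit column metadata is one way to avoid each language applying its own epoch; a typed columnar format such as Parquet addresses the same problem by carrying the type in the file itself.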
Synthetic Data, Collaboration and the Next Phase of Digital Trials
Before real data arrives, synthetic data (generated through parametric modelling, generative AI, or hybrid simulation) is proving to be a powerful enabler. Its role in early pipeline testing, edit check validation and robust model development is widely acknowledged, and tools for its generation are proliferating. Synthetic data is especially valuable for safeguarding privacy, reducing bias and enabling broad-based scenario testing. With regulatory understanding maturing, its place in the clinical programming repertoire looks set to become permanent.
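As a rough illustration of the parametric route, the sketch below generates a small, reproducible demographics table using only the Python standard library. The function name, the subject ID pattern, the arm labels and the age distribution (normal with mean 55 and standard deviation 12, clamped to 18 to 85) are all assumptions chosen for the example rather than anything reported at the conference. The column names follow the familiar SDTM DM variables.

```python
import random


def make_synthetic_dm(n_subjects, seed=2025):
    """Generate a toy demographics table from simple parametric assumptions."""
    # A local, seeded generator keeps the output reproducible across runs
    # without touching the global random state.
    rng = random.Random(seed)
    arms = ["PLACEBO", "ACTIVE"]  # hypothetical treatment arms
    rows = []
    for i in range(1, n_subjects + 1):
        # Assumed age model: normal draw, rounded, then clamped to 18..85.
        age = round(rng.gauss(55, 12))
        age = max(18, min(85, age))
        rows.append(
            {
                "USUBJID": f"SYN-{i:04d}",
                "AGE": age,
                "SEX": rng.choice(["F", "M"]),
                "ARM": rng.choice(arms),
            }
        )
    return rows


if __name__ == "__main__":
    for row in make_synthetic_dm(5):
        print(row)
```

Fixing the seed means the same synthetic subjects are produced on every run, which is what makes such data usable for regression-testing a pipeline or an edit check before real data is available.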
Collaboration is also increasingly central. From modular, open-source pipeline blueprints for Health Technology Assessment submissions to the joint authoring of AI governance standards, the principle of collective progress is gaining ground.
New Approaches to Automation, File Management and Audit Trails
Beyond programming languages and modelling, significant innovation is also occurring in the practical matters of file management, document automation and auditability. Commercial tools such as Aspose, tightly integrated into SAS via Java or Python, allow for robust, scripted handling and conversion of Office and PDF files within regulated workflows. Meanwhile, the integration of modern spreadsheet front ends (such as SpreadJS in Shiny) with precise audit trails (using technologies like JSON-Patch and Git) enables compliant, multi-user spreadsheet collaboration at scale.
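The audit trail idea can be sketched in a few lines of Python: every cell edit is stored as a JSON Patch operation (RFC 6902) addressed by a JSON Pointer path, so the current sheet can always be rebuilt from the original plus the log. This is a minimal sketch under stated assumptions, not the implementation presented at the conference. It handles only the "replace" operation, ignores JSON Pointer escape sequences, and the function names and the extra "user", "timestamp" and "previous" fields are hypothetical additions for audit purposes that sit outside the JSON Patch standard.

```python
import copy
from datetime import datetime, timezone


def apply_replace(document, pointer, value):
    """Apply a JSON Patch 'replace' in place and return the previous value.

    Raises KeyError or IndexError if the path does not exist, in line with
    the requirement that 'replace' targets an existing location.
    """
    parts = pointer.split("/")[1:]  # drop the empty token before the first '/'
    target = document
    for part in parts[:-1]:
        target = target[int(part)] if isinstance(target, list) else target[part]
    key = int(parts[-1]) if isinstance(target, list) else parts[-1]
    previous = target[key]
    target[key] = value
    return previous


def record_edit(sheet, audit_log, user, pointer, value):
    """Apply one cell edit and append a matching entry to the audit log."""
    previous = apply_replace(sheet, pointer, value)
    audit_log.append(
        {
            "user": user,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "patch": [{"op": "replace", "path": pointer, "value": value}],
            "previous": previous,  # extra audit field, not part of JSON Patch
        }
    )


def replay(original, audit_log):
    """Rebuild the current sheet from the original plus the logged patches."""
    document = copy.deepcopy(original)
    for entry in audit_log:
        for op in entry["patch"]:
            apply_replace(document, op["path"], op["value"])
    return document


if __name__ == "__main__":
    original = {"rows": [{"USUBJID": "SYN-0001", "AGE": 54}]}
    sheet = copy.deepcopy(original)
    log = []
    record_edit(sheet, log, "reviewer1", "/rows/0/AGE", 55)
    print(log[0]["patch"], replay(original, log) == sheet)
```

Because each log entry is a small, self-describing JSON document, the log can itself be committed to Git, which gives a second, independent history of who changed what and when.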
Conclusions and Outlook
PHUSE EU Connect 2025 captures an industry in transition: redefining its technical foundations, human capital and operational models. The march towards unified, AI-assisted, open-source and interoperable environments is well under way, but change is proving as much an organisational as a technical challenge. Progress hinges on leadership in change management, strategic investment in people and communities of practice, relentless focus on validation and governance and a willingness to embrace new technological paradigms without compromising regulatory trust.
With collaboration as a guiding principle and a commitment to openness, transparency and explainability, the clinical programming community is poised to deliver not only more efficient, but also more reliable and trustworthy, science. The horizon seems broad and bright: the future belongs to those organisations that best harmonise innovation with integrity and automation with human expertise.