August 6, 2010

E-Separation Solutions

E-Separation Solutions-08-12-2010, Volume 0, Issue 0

Joining us for a discussion on software are Tony Owen, Agilent Technologies; Shaun Quinn, Dionex; David Chiang, Sage-N Research; and Edward C. Long, Thermo Fisher Scientific.

With the rapid pace of technological advancement in all areas of chromatographic instrumentation in recent years, software has become more critical than ever to lab productivity.

Joining us for a discussion on software this month are Tony Owen of Agilent Technologies, Inc.; Shaun Quinn of Dionex Corporation; David Chiang of Sage-N Research; and Edward C. Long of Thermo Fisher Scientific.

What trends do you see emerging in software? How has software been evolving?

Owen: Analytical data systems are moving from single-workstation, proprietary solutions to distributed, collaborative systems that are connected through a centralized SDMS (scientific data management system), LIMS (laboratory information management system), and ELN (electronic laboratory notebook). Open standards and shared services (i.e., licensing, user management, instrument management) provide increased productivity and the ability to focus on integration and development of application-specific software solutions. We are seeing increasing demand for multivendor instrument control and data interchange standards with technology-neutral formats (TNF), such as AnIML. Improved efficiency is being enabled through workflow automation and optimized utilization of instruments and lab personnel to maximize lab productivity. More lab personnel are requiring adoption of software industry standards to improve their day-to-day operations. Finally, there is a movement away from paper-based processes toward electronic processes (e-signatures, IP protection).

Quinn: Operational trends that improve and simplify the user experience in scientific software have been emerging and I believe are set to continue to evolve. In particular, further automation of tasks and providing guided workflows that reduce the users’ overall effort to generate results are set to carry forward. The aim is less interaction whilst still exploiting the full functionality that the software offers.

Chiang: In mass spectrometry-based proteomics, the big challenge is the geometric explosion of data. This has required software to adapt in two ways. First, there is a trend from PC to server software in order to achieve the robustness and throughput required. Just as software that manages the corner grocery store is not enough to manage Walmart, software that used to handle a handful of proteins in a simple gel spot is not enough to handle the full proteome of organelles. Proteomics analysis is now akin to the software hedge funds use to uncover hidden market trends: proteomics search engines are used to uncover low-abundance peptides and proteins and their post-translational modifications.

Second, there is a shift from interactive to automated analyses, as it is no longer feasible for a technician to sit and click on menus when the datasets are so large. PC software, designed for manual interrogation of simpler datasets, is no longer sufficient; what is required are robust, server-class, statistically solid algorithms to sift through the haystack to find the needles.

Long: Software, especially chromatography data system (CDS) software, is evolving to provide comprehensive coverage of all factors affecting the chromatography information. It is no longer enough for CDS software to provide sound peak detection, integration, and reporting. There are so many other factors that chromatographers need to understand in gas and liquid chromatography separations that CDS software is evolving to address those “other factors” along with the basic mathematical treatment of the data. Key experimental instrument parameters such as flow, backpressure, gradient composition, and even temperature can affect the liquid chromatography separations in most mixtures, but only recently has CDS software incorporated capabilities to track, monitor, and record this information along with the detector outputs.

Another evolving trend in software is to integrate flexible reporting as part of the total system. Many chromatography labs still export their processed information from a CDS into other software packages like Excel, Word, or PowerPoint. Part of this comes from the limited reporting capabilities in CDS systems; part comes from the rigid nature of many reporting engines in CDS packages. Not only does this degrade laboratory efficiency, but it also introduces additional problems in laboratory validation and quality of results when multiple software packages are routinely used. New CDS software under development is evolving to provide a more comprehensive and intelligent form of reporting.

Chromatography generates vast amounts of information with each run, and deeper information (patterns, trends, sample-to-sample variations, etc.) can be gleaned from it, but chromatography laboratories are limited by CDS systems that are unable to rapidly and easily correlate and organize all the chromatography results. Managing chromatography results in structured databases has been occurring for a number of years, and this trend will accelerate in the coming years as laboratories must find more efficient ways to manage the wealth of their chromatography information.

By managing all of this chromatography information, the evolving CDS will become more useful and informative to the chromatographer, enabling them to better understand their information and make it ultimately more valuable for everyone.

What is the software application you see growing the fastest?

Owen: Electronic laboratory notebooks are the fastest growing area, focused mostly around IP protection and discovery rights; however, the market is very fragmented as it is based on workflow, much like LIMS. Data management is moving from archival/storage to true SDMS. Laboratories, including pharmaceutical laboratories, will extend their borders to outsource more services that they should buy rather than own. This includes the need for adaptability: for example, the integration of new entities, collaboration between and across entities, remote access and mobility, optimizing asset usage, and the accommodation of local practices.

Quinn: Revitalized chromatographic data systems that are not only compatible with modern computing technologies such as .NET and 64-bit, but also meet the demands of the latest laboratory technologies such as UHPLC. With 64-bit computing set to become the standard, companies migrating their hardware or upgrading their environments will force a resurgence in compatible chromatographic data systems. In addition, with the fast-growing UHPLC instrument sector, there is corresponding growth in CDS software that matches the performance and efficiency principles of such techniques.

Chiang: Within proteomics analyses, there is a need for an integrated software workflow that can identify peptides/proteins, identify their post-translational modifications (particularly phosphorylation), and quantify the samples. All the pieces are there, but the integration can be very tricky, especially for high-sensitivity, statistically solid analyses. Hedge fund IT serves as an excellent model for how efficient proteomics data analysis works. On one side is the high-value trader, the George Soros type dealing with high-value information, who combines deep expertise in currency exchanges with the data analysis to discover new potential trades. On the other side are large computer servers and storage systems that run semi-custom software to slice and dice the data in different ways to find the billion-dollar hidden trend. In the middle are informatics specialists (who, by the way, are not “programmers” in the same way that “linguists” are not “writers”) who bridge the two sides by scripting new visualization or data presentation routines. Here, you can bet that the trader focuses on the trading and does not waste time installing software and maintaining computers.

Long: Both GC and LC instrumentation are producing faster chromatography such that mixtures separate faster with improved peak shape. High-speed LC, a result of the commercialization of dedicated LC systems capable of dealing with the higher back pressure required for small particle (sub-2-µm) LC columns, can provide faster separations and superior separation efficiency compared to conventional LC and is projected to be the fastest growing segment of LC over the next five years. Correspondingly, CDS software to adequately accommodate the new instrumentation must keep pace.

What obstacles stand in the way of software development?

Owen: Regulatory compliance and the training of employees and end users affect the acceptance of newer technologies. The software technology lifecycle is outpacing the analytical industry's ability to absorb necessary changes. Hardware lifecycles are much longer than typical PC lifecycles, and support of mature solutions is mandatory in the lab, so managing older data and the original data systems must be considered and supported. Some vendors resist trends and try to maintain proprietary linkages to protect their business. User investment in "legacy" software and the high cost of migration are barriers to commercial adoption of new software.

Quinn: With 64-bit computing gathering momentum, managing the transition and ensuring compatibility with the latest .NET framework is essential for software developers. One of the main drawbacks in the past to running a 64-bit system was a lack of 64-bit drivers to make all connected hardware work properly. This places a large burden on software developers to ensure compatibility of their own drivers and programs and that of any associated third-party drivers and programs.

Chiang: The biggest problem facing software is the wide variation in chromatography and chemistry, which requires semi-customization for correct analyses. A flexible platform, with the ability for semi-custom analysis, will be important. The nature of the data analysis needed for proteomics is changing, as it becomes more akin to hedge fund data mining than to an administrative assistant running an Excel spreadsheet. This is especially true for quantitation and ETD data analyses, where the field has not settled onto a de facto one-size-fits-all methodology and where some semi-customization of the analysis to query and adapt to a particular dataset will be necessary. This is why the large-scale SILAC papers are always done by research groups with their own bioinformatics resources and why just about any off-the-shelf software you can download or buy will probably not work well for your needs without some customization.

Long: As all of these developments in CDS continue on an instrumentation level, the overall deployments of LC and GC in multinational, multisite operations make it essential that CDS be capable of supporting chromatography operations for local environments with remote system support. Rapid growth in the Far East and Latin/South American countries for LC and GC instrumentation is expected over the next several years, which will greatly impact commercial CDS developments. While most CDS vendors are accustomed to servicing North American and European markets, with the anticipated growth in Asia and South America, their challenge now is to also provide a product that addresses specific end-user needs in these geographic areas through localized language support, product training, and local personnel support. Many chromatography and CDS vendors, including my company, are also evolving the means by which training and education can accompany the deployment. The use of “electronic” learning or web-delivered computer instruction, specific to the software, may prove to be the ideal platform to uniformly and professionally train CDS users in these environments.

Global companies expanding their operations in these regions, either by building their own facilities or engaging in collaborative business ventures, will also need to deploy uniform, scalable software solutions like a CDS to ensure smooth data and information sharing. Scalable CDS systems that flexibly allow for different deployments with remote system management and support enable this growth in global operations far more easily than early-generation CDS. In the same way that multinational businesses deploy business software applications throughout their sites but support them through specialized remote management, CDS in the business environment will also need to further adapt and support such capabilities. Finally, the CDS of the future will be developed within the broader framework of supporting an expansive and ever-growing multivendor hardware universe with less emphasis on proprietary collaborations.

What software developments do you expect to see in the future?

Owen: I expect to see integration and collaboration across vendors in the analytical and life science space, along with tools to manage and review data that can handle the explosion of data and quickly take data to decisions. We will also see open systems (plug-and-play architecture) that enable ease of integration into customers' IT environments and are flexible enough to accommodate customer workflows. Finally, there will be a movement from islands of instrument-based laboratory information to a workflow-based, integrated enterprise view, which drives reuse of data to get to decisions and discovery faster.

Quinn: My expectation is to see intelligent automation in areas such as method development, diagnostics, and troubleshooting. Automation is an important step toward making development easier and generating better quality data. To ensure minimum downtime, better diagnostics and troubleshooting will also be developed in line with the automation.

Chiang: The evolution of software products from point solutions to a robust, server-class platform with flexibility for customization will be the key. Discovery proteomics should have the biologist focusing on the science and let the specialized proteomics IT experts handle the backroom servers and integrated storage systems. This is in stark contrast to most proteomics labs today, where the same few people are trying to run big experiments using little PCs and wasting time managing computers rather than doing science. The combined “computing appliance plus support” model is an excellent fit for translational proteomics, allowing scientists operating million-dollar mass specs to focus on the science, and companies such as mine to handle the backroom proteomics IT analysis and storage, and workflow customization.

If you are interested in participating in any upcoming Technology Forums, please contact Associate Editor Meg Evans for more information. Next month’s forums will focus on the HPLC and Green Chemistry markets.