Oil and gas operators are drowning in seismic data, and their ability to manage, access and collaborate on that data is of huge strategic importance. Improving how quickly that data can be stored, reached and shared directly reduces their “time-to-oil”: the time it takes to identify oil assets and extract them in the most efficient way. It’s no exaggeration to say that having the wrong platform can cost millions of dollars a day if data is unavailable or inaccessible to a global resource pool. And the faster they can interpret the data, the bigger their competitive advantage.
But these data sets, often collected in extreme, remote environments, are enormous. As technology continues to provide ever more detailed information on geologic structures, these files are growing exponentially in both size and number. Many operators have petabytes of data, and individual file volumes can be hundreds of terabytes in size.
Unfortunately, oil and gas companies often rely on an infrastructure that was not designed to cope with this gargantuan amount of data. Transmitting this data can be slow, and transfers are often interrupted because wide-area network (WAN) lines are expensive and constricted. The local collection site is often remote and hazardous, so storing data there is impractical.
The slow rate of transfer combined with the scale of seismic data also makes collaborating on these data sets difficult. Often, the operators send the field data to a regional office equipped with powerful workstations that can run the seismic interpretation software geoscientists use in their analyses. The process of moving data from collection to the point at which the information can be acted upon can often take many weeks. And once analysis is finally complete, there’s the problem of sharing the results. It’s common for companies to ship multi-terabyte hard drives around the world because the networks are far too slow.
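A quick back-of-the-envelope calculation shows why shipping drives can beat the network. The sketch below estimates how long a data set takes to cross a link. The 100 TB size, the link speeds and the 70 percent efficiency factor are illustrative assumptions, not figures from any particular operator.

```python
def transfer_days(size_tb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate the days needed to move size_tb terabytes over a link_mbps link.

    efficiency is an assumed fraction of nominal bandwidth that is actually
    usable after protocol overhead and interruptions.
    """
    if size_tb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link speed > 0, efficiency in (0, 1]")
    bits = size_tb * 1e12 * 8                  # decimal terabytes -> bits
    usable_bps = link_mbps * 1e6 * efficiency  # megabits/s -> usable bits/s
    return bits / usable_bps / 86_400          # 86,400 seconds per day


if __name__ == "__main__":
    # Assumed link speeds, chosen only to show the order of magnitude.
    for mbps in (100, 1_000, 10_000):
        print(f"100 TB over {mbps:>6} Mbps: {transfer_days(100, mbps):6.1f} days")
```

Under these assumptions, a 100 TB data set needs roughly 132 days on a 100 Mbps line and about 13 days on a 1 Gbps line, before counting any retries after an interruption.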
Because it’s so slow to share and collaborate on these very large data sets, it’s difficult for multiple geoscientists to review the data and results, opening up more potential for error. And when it comes to choosing a drill site, these errors can be extremely costly.
There are additional challenges beyond transmission and collaboration. The sheer size and exponential growth of seismic data make it very challenging to manage, protect and connect to analytics. The lengthy seismic interpretation lifecycle slows down time-to-oil because the laborious task of collecting, storing, copying, sharing, analyzing and managing seismic data can take 18 months or more. And that’s assuming these antiquated systems, engineered for an era that enterprises in other industries have long since left behind, are running perfectly. Traditional file-storage systems keep information in discrete, local silos, making it costly and cumbersome to consolidate the data and extract its full value. This siloed approach also opens up huge risks from ransomware and cyberthreats as companies struggle to implement robust recovery practices across tens or hundreds of global locations where the infrastructure and local support vary considerably.
Much of this time can be chalked up to the inefficiencies of the process. Any improvement in the efficiency of seismic interpretation activities could significantly reduce time-to-oil, which realizes profits faster and reduces costs. In these uncertain times, that’s a huge competitive advantage.
Enter the cloud
Cloud-based file storage offers a new model for managing, sharing and protecting seismic data sets. Like many large organizations, oil and gas companies are moving workloads to the cloud, but these are primarily compute workloads. File data has largely remained in traditional, on-premises network-attached storage (NAS) systems.
But when cloud-based file storage is synchronized with a virtual desktop infrastructure (VDI), massive seismic data files are instantly accessible to an operator’s data scientists no matter where they are located. VDI isn’t the only option, however. If an operator would prefer to use high-powered, local workstations, the current dataset can be cached locally, and all changes can be sent to the cloud and then propagated back out to other users around the world. In both cases, the master copy of the seismic data set resides in the cloud.
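The local-workstation flow described above (write locally, send the change to the cloud, propagate it to other sites) can be sketched in a few lines. This is a minimal illustration, not any vendor's protocol: `CloudMaster`, `Site`, the site names and the file name are all hypothetical, and the sketch ignores conflict handling, partial-file transfer and security.

```python
from dataclasses import dataclass, field


@dataclass
class CloudMaster:
    """Hypothetical stand-in for the master copy held in cloud storage."""
    files: dict[str, tuple[int, bytes]] = field(default_factory=dict)  # path -> (version, data)

    def push(self, path: str, data: bytes) -> int:
        """Store a new version of a file and return its version number."""
        version = self.files.get(path, (0, b""))[0] + 1
        self.files[path] = (version, data)
        return version


@dataclass
class Site:
    """Hypothetical office whose workstations use a local cache."""
    name: str
    master: CloudMaster
    cache: dict[str, tuple[int, bytes]] = field(default_factory=dict)

    def write(self, path: str, data: bytes) -> None:
        """Send the change to the cloud master, then keep it in the local cache."""
        version = self.master.push(path, data)
        self.cache[path] = (version, data)

    def sync(self) -> list[str]:
        """Pull every file whose master version is newer than the cached one."""
        updated = []
        for path, (version, data) in self.master.files.items():
            if self.cache.get(path, (0, b""))[0] < version:
                self.cache[path] = (version, data)
                updated.append(path)
        return updated


if __name__ == "__main__":
    master = CloudMaster()
    houston, perth = Site("houston", master), Site("perth", master)
    houston.write("survey.segy", b"interpretation v1")
    print(perth.sync())  # the second site picks up the first site's change
```

A real platform would move only changed portions of a file and would cache just the working data set; the sketch copies whole files to keep the idea visible.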
Of course, it’s not quite as simple as chucking file data into raw cloud storage and pointing workstations or desktop images to the cloud buckets. Data files need to be orchestrated across locations and made accessible in a user-friendly way. To avoid the latency that would arise from accessing massive volumes of data across long distances, the architecture should follow a hub-and-spoke model: cloud storage acts as the hub, and frequently accessed data is cached near end users in regional spokes. These regional caches, called edge appliances, enable users to access seismic data at fast local-area network (LAN) speeds. Using this architecture, a file-services platform can synchronize data across any number of locations while providing the proverbial “single pane of glass” for managing an otherwise unwieldy global file infrastructure.
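To make the role of a regional cache concrete, here is a minimal sketch of one that keeps recently used data local and goes back to the cloud hub only on a miss. It assumes a least-recently-used eviction policy, which is a common choice but not one the architecture above prescribes. `EdgeCache` and `fetch_from_hub` are hypothetical names.

```python
from collections import OrderedDict
from typing import Callable


class EdgeCache:
    """Least-recently-used cache standing in for a regional edge appliance."""

    def __init__(self, fetch_from_hub: Callable[[str], bytes], capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._fetch = fetch_from_hub   # slow path: a WAN read from the cloud hub
        self._capacity = capacity      # maximum number of cached items
        self._items = OrderedDict()    # key -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, key: str) -> bytes:
        if key in self._items:
            self._items.move_to_end(key)     # mark as most recently used
            self.hits += 1
            return self._items[key]
        self.misses += 1
        data = self._fetch(key)              # go to the hub only on a miss
        self._items[key] = data
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used item
        return data


if __name__ == "__main__":
    hub = {"line-001": b"...", "line-002": b"...", "line-003": b"..."}  # fake cloud store
    cache = EdgeCache(hub.__getitem__, capacity=2)
    for key in ("line-001", "line-002", "line-001", "line-003"):
        cache.read(key)
    print(f"hits={cache.hits} misses={cache.misses}")
```

Every hit is served at local speed, so the payoff depends on how much of a team's working set fits in the cache.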
Having the master data set consolidated in the cloud makes a number of other functions far easier and less expensive:
- Analytics: Seismic data can be easily connected to powerful analytics software, either run by the operator in a cloud instance or via the cloud provider’s own analytics services.
- Accessibility and collaboration: The cloud is accessible from anywhere, which simplifies collaboration and makes it unnecessary to ship data or fly specialists from location to location.
- Protection: Modern hyper-scale cloud providers have built an enormously redundant architecture, with multiple copies of all data in several locations. Additionally, object storage is extremely durable and can be used in a “write once read many” (WORM) mode. This means that the data itself is immutable and cannot be changed, rendering it impervious to ransomware attacks. However, since file data typically must change, cloud file storage providers use storage snapshots to capture changes to the data. As a result, if data is corrupted, all a provider needs to do is roll back to an earlier version, which could be as little as five minutes old.
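The snapshot-and-rollback idea in the protection point above can be illustrated with a small sketch. It assumes a simplified model in which every write appends a new immutable object and a snapshot records which object each file pointed to at that moment. `SnapshotStore` and its methods are hypothetical and do not represent any provider's API.

```python
class SnapshotStore:
    """Toy file store built on append-only objects plus point-in-time snapshots."""

    def __init__(self):
        self._objects = []    # append-only list of bytes; entries are never modified
        self._live = {}       # path -> index of the file's current object
        self._snapshots = []  # each snapshot is a frozen copy of the path -> index map

    def write(self, path: str, data: bytes) -> None:
        """Store a new immutable object and point the file at it."""
        self._objects.append(bytes(data))
        self._live[path] = len(self._objects) - 1

    def read(self, path: str) -> bytes:
        return self._objects[self._live[path]]

    def snapshot(self) -> int:
        """Record the current state of every file and return a snapshot id."""
        self._snapshots.append(dict(self._live))
        return len(self._snapshots) - 1

    def rollback(self, snapshot_id: int) -> None:
        """Point every file back at the object it had when the snapshot was taken."""
        self._live = dict(self._snapshots[snapshot_id])


if __name__ == "__main__":
    store = SnapshotStore()
    store.write("survey.segy", b"clean seismic data")
    good = store.snapshot()
    store.write("survey.segy", b"encrypted by ransomware")  # simulated corruption
    store.rollback(good)
    print(store.read("survey.segy"))
```

Because the corrupted write only added a new object, the earlier data is still intact, and recovery is a matter of switching pointers instead of restoring a backup.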
As a result, operators get unlimited capacity at the touch of a button, global collaboration and automatic data protection, all without investing in or managing storage equipment. With seismic data flowing effortlessly and global specialists able to access it from anywhere, operators can extract the full value of their seismic data, which means they can move far more quickly to extracting valuable oil and gas.
About the Author: Russ Kennedy is the chief product officer at Nasuni and has more than 25 years of experience developing software and hardware solutions to address exponential data growth. Before Nasuni, Russ directed product strategy at private cloud object storage pioneer Cleversafe through its $1.3 billion acquisition by IBM.
An avid cyclist and hiker, Russ resides in Boulder, Colorado, with his family. He has a BS degree in Computer Science from Colorado State University and an MBA degree from the University of Colorado.