
Jan 12 2021

Comparing Storage APIs from Amazon, Microsoft and Google Clouds

One of the unique capabilities of IVAAP is that it works with the cloud infrastructure of multiple vendors. Whether your SEGY file is posted on Microsoft Azure Blob Storage, Amazon S3 or Google Cloud Storage, IVAAP will be capable of visualizing it.

It’s only when administrators register new connectors that vendor-specific details need to be entered. For all other users, the user interface is identical regardless of the data source. The REST API consumed by IVAAP’s HTML5 client is common to all connectors as well. The key component that does the hard work of “speaking the language of each cloud vendor and hiding their details from the other components” is the IVAAP Data Backend.
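
To give an idea of what hiding these details means in practice, here is a minimal sketch of the kind of storage abstraction such a backend might define. The interface and method names below are hypothetical and only meant for illustration; they are not IVAAP’s actual SDK.

import java.io.IOException;
import java.io.InputStream;
import java.time.Instant;
import java.util.List;

// Hypothetical abstraction: each cloud vendor gets its own implementation,
// and the rest of the backend only talks to this interface.
public interface CloudObjectStore {

    // Does this object or blob exist?
    boolean exists(String container, String name) throws IOException;

    // When was it last modified?
    Instant lastModified(String container, String name) throws IOException;

    // Open a stream on its content.
    InputStream openStream(String container, String name) throws IOException;

    // List the immediate children of a "folder", honoring the / delimiter.
    List<String> listFolder(String container, String parentFolderPath) throws IOException;
}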

While the concept of “storage in the cloud” is similar across all three vendors, they each provide a different API to achieve similar goals. In this article, we will compare how to implement 4 basic functionalities. Because the IVAAP Data Backend is written in Java, we’ll only compare Java APIs.

 

Checking that an Object or Blob Exists

Amazon S3

String awsAccessKey = …
String awsSecretKey = …
String region = …
String bucketName = …
String keyName = …
AwsCredentials credentials = AwsBasicCredentials.create(awsAccessKey, awsSecretKey);
S3Client s3Client = S3Client.builder()
        .credentialsProvider(StaticCredentialsProvider.create(credentials))
        .region(Region.of(region))
        .build();
try {
    HeadObjectRequest request = HeadObjectRequest.builder().bucket(bucketName).key(keyName).build();
    s3Client.headObject(request);
    return true;
} catch (NoSuchKeyException e) {
    return false;
}

Microsoft Azure Blob Storage

String accountName = …
String accountKey = …
String containerName = …
String blobName = ...
StorageSharedKeyCredential credential = new StorageSharedKeyCredential(accountName, accountKey);
String endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
BlobServiceClientBuilder builder = new BlobServiceClientBuilder().endpoint(endpoint).credential(credential);
BlobServiceClient client = builder.buildClient();
BlobContainerClient containerClient = client.getBlobContainerClient(containerName);
BlobClient blobClient = containerClient.getBlobClient(blobName);
return blobClient.exists();

Google Cloud Storage

String authKey = …
String projectId = …
String bucketName = …
String blobName = ...
ObjectMapper mapper = new ObjectMapper();
JsonNode node = mapper.readTree(authKey);
ByteArrayInputStream in = new ByteArrayInputStream(mapper.writeValueAsBytes(node));
GoogleCredentials credentials = GoogleCredentials.fromStream(in);
Storage storage = StorageOptions.newBuilder().setCredentials(credentials)
                        .setProjectId(projectId)
                        .build()
                        .getService();
Blob blob = storage.get(bucketName, blobName, BlobGetOption.fields(BlobField.ID));
return blob != null && blob.exists();

 

Getting the Last Modification Date of an Object or Blob

Amazon S3

String awsAccessKey = …
String awsSecretKey = …
String region = …
String bucketName = …
String keyName = …
AwsCredentials credentials = AwsBasicCredentials.create(awsAccessKey, awsSecretKey);
S3Client s3Client = S3Client.builder()
        .credentialsProvider(StaticCredentialsProvider.create(credentials))
        .region(Region.of(region))
        .build();
HeadObjectRequest headObjectRequest = HeadObjectRequest.builder()
        .bucket(bucketName)
        .key(keyName)
        .build();
HeadObjectResponse headObjectResponse = s3Client.headObject(headObjectRequest);
return headObjectResponse.lastModified();

Microsoft Azure Blob Storage

String accountName = …
String accountKey = …
String containerName = …
String blobName = …
StorageSharedKeyCredential credential = new StorageSharedKeyCredential(accountName, accountKey);
String endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
BlobServiceClientBuilder builder = new BlobServiceClientBuilder()
.endpoint(endpoint)
.credential(credential);
BlobServiceClient client = builder.buildClient();
BlobContainerClient containerClient = client.getBlobContainerClient(containerName);
BlobClient blob = containerClient.getBlobClient(blobName);
BlobProperties properties = blob.getProperties();
return properties.getLastModified();

Google Cloud Storage

String authKey = …
String projectId = …
String bucketName = …
String blobName = …
ObjectMapper mapper = new ObjectMapper();
JsonNode node = mapper.readTree(authKey);
ByteArrayInputStream in = new ByteArrayInputStream(mapper.writeValueAsBytes(node));
GoogleCredentials credentials = GoogleCredentials.fromStream(in);
Storage storage = StorageOptions.newBuilder().setCredentials(credentials)
                        .setProjectId(projectId)
                        .build()
                        .getService();
Blob blob = storage.get(bucketName, blobName,  BlobGetOption.fields(Storage.BlobField.UPDATED));
return blob.getUpdateTime();

 

Getting an Input Stream out of an Object or Blob

Amazon S3

String awsAccessKey = …
String awsSecretKey = …
String region = …
String bucketName = …
String keyName = …
AwsCredentials credentials = AwsBasicCredentials.create(awsAccessKey, awsSecretKey);
S3Client s3Client = S3Client.builder()
        .credentialsProvider(StaticCredentialsProvider.create(credentials))
        .region(Region.of(region))
        .build();
GetObjectRequest getObjectRequest = GetObjectRequest.builder()
        .bucket(bucketName)
        .key(keyName)
        .build();
return s3Client.getObject(getObjectRequest);

Microsoft Azure Blob Storage

String accountName = …
String accountKey = …
String containerName = …
String blobName = …
StorageSharedKeyCredential credential = new StorageSharedKeyCredential(accountName, accountKey);
String endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
BlobServiceClientBuilder builder = new BlobServiceClientBuilder()
.endpoint(endpoint)
.credential(credential);
BlobServiceClient client = builder.buildClient();
BlobContainerClient containerClient = client.getBlobContainerClient(containerName);
BlobClient blob = containerClient.getBlobClient(blobName);
return blob.openInputStream();

Google Cloud Storage

String authKey = …
String projectId = …
String bucketName = …
String blobName = …
ObjectMapper mapper = new ObjectMapper();
JsonNode node = mapper.readTree(authKey);
ByteArrayInputStream in = new ByteArrayInputStream(mapper.writeValueAsBytes(node));
GoogleCredentials credentials = GoogleCredentials.fromStream(in);
Storage storage = StorageOptions.newBuilder().setCredentials(credentials)
                        .setProjectId(projectId)
                        .build()
                        .getService();
Blob blob = storage.get(bucketName, blobName,  BlobGetOption.fields(BlobField.values()));
return Channels.newInputStream(blob.reader());

 

Listing the Objects in a Bucket or Container While Taking into Account Folder Hierarchies

S3

String awsAccessKey = …
String awsSecretKey = …
String region = …
String bucketName = …
String parentFolderPath = ...
AwsCredentials credentials = AwsBasicCredentials.create(awsAccessKey, awsSecretKey);
S3Client s3Client = S3Client.builder()
        .credentialsProvider(StaticCredentialsProvider.create(credentials))
        .region(Region.of(region))
        .build();
ListObjectsV2Request.Builder builder = ListObjectsV2Request.builder().bucket(bucketName).delimiter("/").prefix(parentFolderPath + "/");
ListObjectsV2Request request = builder.build();
ListObjectsV2Iterable paginator = s3Client.listObjectsV2Paginator(request);
Iterator<CommonPrefix> foldersIterator = paginator.commonPrefixes().iterator();
while (foldersIterator.hasNext()) {
…
}

Microsoft

String accountName = …
String accountKey = …
String containerName = …
String parentFolderPath = ...
StorageSharedKeyCredential credential = new StorageSharedKeyCredential(accountName, accountKey);
String endpoint = String.format(Locale.ROOT, "https://%s.blob.core.windows.net", accountName);
BlobServiceClientBuilder builder = new BlobServiceClientBuilder()
        .endpoint(endpoint)
        .credential(credential);
BlobServiceClient client = builder.buildClient();
BlobContainerClient containerClient = client.getBlobContainerClient(containerName);
Iterable<BlobItem> iterable = containerClient.listBlobsByHierarchy(parentFolderPath + "/");
for (BlobItem currentItem : iterable) {
   …
}

Google

String authKey = …
String projectId = …
String bucketName = …
String parentFolderPath = ...
ObjectMapper mapper = new ObjectMapper();
JsonNode node = mapper.readTree(authKey);
ByteArrayInputStream in = new ByteArrayInputStream(mapper.writeValueAsBytes(node));
GoogleCredentials credentials = GoogleCredentials.fromStream(in);
Storage storage = StorageOptions.newBuilder().setCredentials(credentials)
                        .setProjectId(projectId)
                        .build()
                        .getService();
Page<Blob> blobs = storage.list(bucketName, BlobListOption.prefix(parentFolderPath + "/"), BlobListOption.currentDirectory());
for (Blob currentBlob : blobs.iterateAll()) {
 ...
}

 

Most developers will discover these APIs by leveraging their favorite search engine. Driven by innovation and performance, cloud APIs become obsolete quickly. Amazon was the pioneer, and much of the documentation still indexed by Google covers the v1 SDK, even though v2 has been available for more than two years (it is not a drop-in replacement for v1). This sometimes makes research challenging for even the simplest needs. Microsoft migrated from v8 to v12 more recently and has a similar challenge to overcome. Being the most recent major player, the Google SDK is not dragged down much by obsolete articles.

The second way that developers will discover an API is by using the official documentation. I found that the Microsoft documentation is the most accessible. There is a definite feel that the Microsoft Azure documentation is treated as an important part of the product, with lots of high-quality sample code targeted at beginners.

The third way that developers discover an API is by using their IDE’s code completion. All cloud vendors make heavy use of the builder pattern. The builder pattern is a powerful way to provide options without breaking backward compatibility, but slows down the self-discovery of the API. The Amazon S3 API also stays quite close to the HTTP protocol, using terminology such as “GetRequest” and “HeadRequest”. Microsoft had a higher level API in v8 where you were manipulating blobs. The v12 iteration moved away from apparent simplicity by introducing the concept of blob clients instead. Microsoft offers a refreshing explanation of this transition. Overall, I found that the Google SDK tends to offer simpler APIs for performing simple tasks.

There are more criteria than simplicity and discoverability when comparing APIs. Versatility and performance are two of them. The Amazon S3 Java SDK is probably the most versatile because of the larger number of applications that have used its technology. It even works with S3 clones such as MinIO Object Storage (and so does IVAAP). The space where there are still a lot of changes is asynchronous APIs. Asynchronous APIs tend to offer higher scalability and faster execution, but can only be compared in specific use cases where they are actually needed. IVAAP makes heavy use of asynchronous APIs, especially to visualize seismic data. This is an area that evolves rapidly and deserves a more in-depth comparison in another article.
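
As a point of reference, here is roughly what the asynchronous flavor of the Amazon SDK v2 looks like for the same "does this object exist" check shown earlier. This is only a sketch reusing the inputs from the synchronous example, not code taken from IVAAP:

S3AsyncClient asyncClient = S3AsyncClient.builder()
        .credentialsProvider(StaticCredentialsProvider.create(credentials))
        .region(Region.of(region))
        .build();
HeadObjectRequest request = HeadObjectRequest.builder().bucket(bucketName).key(keyName).build();
// The call returns immediately; the CompletableFuture completes when S3 answers.
// For brevity, any failure is treated here as "the object does not exist".
CompletableFuture<Boolean> exists = asyncClient.headObject(request)
        .thenApply(response -> true)
        .exceptionally(error -> false);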

For more information on IVAAP, please visit www.int.com/products/ivaap/

 


Filed Under: IVAAP Tagged With: API, cloud, Google, ivaap, java, Microsoft

Dec 02 2020

IVAAP Release 2.7: More Map Search and ArcGIS Features

IVAAP™ is a subsurface data visualization platform that provides developers and product owners powerful subsurface visualization features for their digital solutions in the cloud. IVAAP enables users to search, access, display, and analyze 2D/3D G&G and petrophysical data in a single user-friendly dashboard on the web. The latest release of IVAAP 2.7 comes with various new features and significant improvements.

Highlights from this release include many advanced search and map capabilities, improved 3D widget filter dialog, new interval curves support, new date/time picker for Cross-Plot widget axis settings, and more! 

Advanced Mapping Capabilities

IVAAP features support for visual-based data discovery using map-based search, and is fully integrated with ArcGIS (ESRI), allowing for the search of structured and unstructured data in a data lake or any other file repository. IVAAP supports a wide range of map formats and services like ArcGIS, GeoJSON, KML, Mapbox, Bing, WMS, and more.

With the ArcGIS integration, you can easily access all layers and details from your ArcGIS server and display them within IVAAP to enrich map-based search of well, seismic, and other subsurface data.

New features include the ability to display a dynamic metadata table for a selected object in the map (Well, Seismic, etc.). 

Most layer types are supported. Image services (ArcGIS Image Service Layer, Image Services Vector Layers, and WMS) and tile services (Image Service, ArcGIS Tiled Map Service Layer, Web Tiled Layer, OpenStreetMap) are supported. Feature services include map services (ArcGIS Feature Layer), KML, WFS, and CSV. We also provide support for some real-time services, such as stream services (ArcGIS Stream Layer), GeoRSS, Vector Tiles (VectorTileLayer), and Bing Maps services. Two extra formats that are supported are GeoJSON and GPX.

This release includes improved search capabilities allowing search across any metadata for any user. Access to data can be restricted to read-only mode. We also made improvements to fence highlighting and access to well lists, and labels can now be saved with the dashboard.

 


 

More Themes Control 

Previously, theme control within IVAAP was a bit limited. We’ve expanded the theme mechanism to all widgets so that users can customize themes with more control and options and access new updated themes.

 


 

New Image Widget 

IVAAP can now display simple image files such as JPEG or PNG in the new image widget, with alignment and zoom controls to show detail. This feature allows users to customize dashboards with logos or other images they need to display.

Improvements in 3D

IVAAP’s 3D widget now supports tagging and aliases when displaying well data. We added reservoir data that can be serialized in the dashboard template. Users now have the ability to mix data with CRS and data without CRS. Another new feature is that users can apply properties to multiple or individual objects. And we’ve improved the dashboard restoration of multiple inlines, crosslines, and time slices. Finally, the 3D widget filter dialog has been redesigned.

 


 

New Features in WellLog

For WellLog, we improved the “set main index” support for templates, dashboards, and well switching. With this improvement, a secondary index can be used to display data against a different index, and secondary indexes can be restored when opening an existing dashboard or template. An improved automatic logarithmic mode gives users the ability to add a curve to a logarithmic track. New features added to WellLog include the ability to automatically rotate labels for lithology and a reset action where users can right-click to clear their display.

 


 

New in Schematics

For the Schematics package, new features include: perforation with state definition support, the ability to customize by using a filter dialog, and the ability to use cursor tracking between Schematics and WellLog. We also improved the component selection support in the Schematics widget.

 


 

Time Series: Annotations and Perforations 

We improved the ability to select a data series from the legend. The Time Series widget now features support for annotations and perforations. The tooltip now shows the index data and time.

 


 

New and Improved Line Chart

The IVAAP line chart now supports templates and a data series dialog. We improved the ability to edit existing data series. Users can now flip the axis for date and time data, and the legend has been improved to show or hide the data series parent. There are also improvements for single data sets, multi-data sets, and multi-parent projects.

 


 

This release includes many more improvements to features and to the UI. For more information, check out the full release here.

Check out int.com/ivaap for a preview of IVAAP. For more information about INT’s other data visualization products, please visit www.int.com or contact us at intinfo@int.com.


Filed Under: IVAAP, Uncategorized Tagged With: 3D, annotations, arcgis, CRS, ivaap, line chart, mapping, schematics, time series, welllog

Nov 20 2020

A New Era in O&G: Critical Components of Bringing Subsurface Data to the Cloud

The oil and gas industry is historically one of the first industries generating actionable data in the modern sense. For example, the first seismic imaging was done in 1932 by John Karcher.

 

Seismic dataset in 1932.

 

Since that first primitive image, seismic data has been digitized and has grown exponentially in size. It is usually represented in monolithic data sets that may range in size from a couple of gigabytes to petabytes for pre-stack data.

Seismic datasets today.

 

The long history, large amount of data, and the nature of the data pose unique challenges that often make it difficult to take advantage of advancing cloud technology. Here is a high-level overview of the challenges of working with oil and gas data and some possible solutions to help companies take advantage of the latest cloud technologies. 

Problems with Current Data Management Systems

Oil and Gas companies are truly global companies, and the data is often distributed among multiple disconnected systems in multiple locations. This not only makes it difficult to find and retrieve data when necessary but also makes it difficult to know what data is available and how useful it is. This often requires person-to-person communication, and some data may even be in offline systems or on someone’s desk.

The glue between those systems is data managers, who are amazing at what they do but still introduce a human factor to the process. They have to understand which dataset is being requested, then search for it on various systems, and finally deliver it to the original requester. How much time does this process take? You guessed it: way too much. And in the end, the requester may realize that it’s not the data they were hoping to get, and the whole process is back to square one.

After the interpretation and exploration process, decisions are usually made on the basis of data screenshots and cherry-picked views, which limit the ability of specialists to make informed decisions. Making bad decisions based on incomplete or limited data can be very expensive. This problem would not exist if the data was easily accessible in real-time. 

And that doesn’t even factor in collaboration between teams and countries. 

How can O&G and service companies manage
their massive subsurface datasets better
by leveraging modern cloud technologies?

3 Key Components of Subsurface Data Lake Implementation

There are three critical components of a successful subsurface data lake implementation: a strong cloud infrastructure, a common data standard, and robust analysis and visualization capabilities. 

 


 

AWS: Massive Cloud Architecture

While IVAAP is compatible with any cloud provider—along with on-premise and hybrid installations—AWS offers a strong distributed cloud infrastructure, reliable storage, compute, and more than 150 other services to empower cloud workflows. 

OSDU: Standardizing Data for the Cloud

The OSDU Forum is an Energy Industry Forum formed to establish an open subsurface Reference Architecture, including a cloud-native subsurface data platform reference architecture, with usable implementations for major cloud providers. It includes Application Standards (APIs) to ensure that all applications (microservices), developed by various parties, can run on any OSDU data platform, and it leverages Industry Data Standards for frictionless integration and data access. The goal of OSDU is to bring all existing formats and standards under one umbrella which can be used by everyone, while still supporting legacy applications and workflows. 

IVAAP: Empowering Data Visualization

A data visualization and analysis platform such as IVAAP, which is the third key component to a successful data lake implementation, provides industry-leading tools for data discovery, visualization, and collaboration. IVAAP also offers integrations with various Machine Learning and artificial intelligence workflows, enabling novel ways of working with data in the cloud.


 

Modern Visualization — The Front End to Your Data

To visualize seismic data, as well as other types of data, in the cloud, INT has developed a native web visualization platform called IVAAP. IVAAP consists of a front-end client application as well as a backend. The backend takes care of accessing, reading, and preparing data for visualization. The client application provides a set of widgets and UI components empowering search, visualization, and collaboration for its users. The data reading and other low-level functions are abstracted from the client by a Domain API, and work through connector microservices on the backend. To provide support for a new data type, you only need to create a new connector. Both parts provide an SDK for developers, and some other perks as well. 

Compute Close to Your Data

Once the data is in the cloud, a variety of services become available. For example, one of them is ElasticSearch from AWS, which helps index the data and provides a search interface. Another service that becomes available is AWS EC2, which provides compute resources that are as distributed as the data is. That’s where IVAAP gets installed.

One of the cloud computing principles is that data has a lot of gravity and all the computing parts tend to get closer to it. This means that it is better to place the processing computer as close to the data as possible. With AWS EC2, we at INT can place our back end very close to the data, regardless of where it is in the world, minimizing latency for the user and enabling on-demand access. Elastic compute resources also enable us to scale up when the usage increases and down when fewer users are active.

 


All of this works together to make your data on-demand—when the data needs to be presented, all the tools and technologies mentioned above come into play, visualizing the necessary data in minutes, or even seconds, with IVAAP dashboards and templates. And of course, the entire setup is secure on every level. 

Empower Search and Discovery

The next step is to make use of this data. And to do so, we need to provide users a way to discover it. What should be made searchable, how to set up a search, and how to expose the search to the users? 

Since searching through numerical values of the data won’t provide a lot of discovery potential, we need some additional metadata. This metadata is extracted along with the data and also uploaded to the cloud. All of it, or a subset, is then indexed using AWS Elasticsearch. IVAAP uses an Elasticsearch connector to perform the search, as well as tools to invoke it through an interactive map interface or filter forms presented to the user.
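
As an illustration, a metadata search like the one described above could be expressed as follows with the Elasticsearch high-level REST client. The index name, field names, and endpoint are made up for this example:

double north = …, west = …, south = …, east = …; // current map extent
RestHighLevelClient client = new RestHighLevelClient(
        RestClient.builder(new HttpHost("my-domain.us-east-1.es.amazonaws.com", 443, "https")));
// Look for seismic datasets whose location falls inside the visible map extent.
SearchSourceBuilder source = new SearchSourceBuilder()
        .query(QueryBuilders.boolQuery()
                .must(QueryBuilders.matchQuery("dataType", "seismic"))
                .filter(QueryBuilders.geoBoundingBoxQuery("location")
                        .setCorners(north, west, south, east)));
SearchResponse response = client.search(
        new SearchRequest("subsurface-metadata").source(source), RequestOptions.DEFAULT);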

How can you optimize web performance of massive domain datasets?

Visualizing Seismic Datasets on the Web

There are two very different approaches to visualizing data. One is to render it on the server and send the rendered images to the client. This process lacks interactivity, which limits the decisions that can be made from those views. The other option is to send data to the client and visualize it on the user’s machine. IVAAP supports both approaches.

While the preferred method—sending data to the client’s machine—provides limitless interactivity and responsiveness of the visuals, it also poses a special challenge: the data is just too big. Transferring terabytes of data from the server to the user would mean serious problems. So how do we solve this challenge? 

First, it is important to understand that not all the data is always visible. We can calculate which part of the data is visible on the user’s screen at any given moment and only request that part. Some of the newer data formats are designed to operate with such reads and provide ways to do chunk reads out of the box. A lot of legacy data formats—for example, SEG-Y—are often unstructured. To properly calculate and read the location of the desired chunk, we need to first have a map—called an Index—that is used to calculate the offset and the size of chunks to be read. Even then, the data might still be too large. 

Luckily, we don’t always need the whole resolution. If a user’s screen is 3,000 pixels wide, they won’t be able to display all 6,000 traces, so we can then adaptively decrease the number of traces to provide for optimal performance.
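
A simplified sketch of that decimation step: given the number of traces covered by the current view and the width of the viewport in pixels, pick a stride so that we never ship more traces than can actually be drawn. This is illustrative code, not IVAAP’s actual implementation:

int tracesInView = 6000;    // traces covered by the visible part of the survey
int viewportWidthPx = 3000; // width of the user's screen area, in pixels

// Keep every n-th trace so the trace count never exceeds the viewport width.
int stride = Math.max(1, (int) Math.ceil((double) tracesInView / viewportWidthPx));
int tracesToRequest = (tracesInView + stride - 1) / stride;
// With 6,000 traces and 3,000 pixels, the stride is 2: every other trace is requested.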


Often the chunks we read are in different places in the file, making it necessary to do multiple reads at the same time. Luckily, both S3 storage and IVAAP support such behavior. We can fire off thousands of requests in parallel, maximizing the efficiency of the network and using it to the full, as some people like to say. And even then, once the traces are picked and ready to ship, we apply some vectorized compression before shipping the data to the client.
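
Here is a sketch of what firing many ranged reads in parallel can look like with the AWS SDK v2 asynchronous S3 client. The chunk list is hypothetical (it would come from the index described above), the usual credentials, region, bucket, and key inputs are assumed, and error handling is omitted:

List<long[]> chunks = … // [offset, length] pairs computed from the index
S3AsyncClient s3 = S3AsyncClient.builder()
        .credentialsProvider(StaticCredentialsProvider.create(credentials))
        .region(Region.of(region))
        .build();
List<CompletableFuture<ResponseBytes<GetObjectResponse>>> pending = new ArrayList<>();
for (long[] chunk : chunks) {
    // One ranged GET per chunk; S3 serves these byte ranges in parallel.
    GetObjectRequest rangedRead = GetObjectRequest.builder()
            .bucket(bucketName)
            .key(keyName)
            .range("bytes=" + chunk[0] + "-" + (chunk[0] + chunk[1] - 1))
            .build();
    pending.add(s3.getObject(rangedRead, AsyncResponseTransformer.toBytes()));
}
// Wait for every ranged read to complete before assembling and compressing the traces.
CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();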

We were talking about legacy file formats here, but GPU compression is also available for newer file formats like VDS/OpenVDS and ZGY/OpenZGY. These newer formats also provide perks like brick storage format, random access patterns, adaptive level of detail, and more.

Once the data reaches the client, JavaScript and Web Assembly technologies come together to decompress the data. The data is then presented to the user using the same technologies through some beautiful widgets, providing interactivity and a lot of control. From there, building a dashboard—drilling, production monitoring, exploration, etc.—with live data takes minutes.

All the mentioned processes are automated and require minimal human management. With all the work mentioned above, we enable a user to search for the data of interest, add it to desired visualization widgets (multiple are available for each type of data), and display on their screen with a set of interactive tools to manipulate the visuals. All within minutes, and while being in their home office. 

That’s not all: a user can save the visualizations and data states into a dashboard and share it with colleagues sitting on a different continent, who can then open the exact same view in a matter of minutes. With more teams working remotely, this seamless sharing helps facilitate collaboration and reduce data redundancy and errors.


Data Security

How do we keep this data secure? There are two layers of authentication and authorization implemented in such a system. First, AWS S3 uses identity-based access policies to guarantee that data is visible only to authorized requests. IVAAP uses OAuth2 integrated with AWS Cognito to authenticate the user and authorize the requests. The user logs into the application and gets a couple of tokens that allow them to communicate with IVAAP services. The client passes tokens back to the IVAAP server. In the back end, IVAAP validates the same tokens with AWS Cognito whenever data reads need to happen. Once validated, a new, temporary signed access token is issued by S3, which IVAAP uses to read from the file in a bucket.
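
The temporary signed access mentioned above can be illustrated with the AWS SDK v2 presigner. This is a minimal sketch, not IVAAP’s actual code:

S3Presigner presigner = S3Presigner.create();
GetObjectRequest getRequest = GetObjectRequest.builder()
        .bucket(bucketName)
        .key(keyName)
        .build();
// Ask S3 for a short-lived signed URL that grants read access to this one object only.
GetObjectPresignRequest presignRequest = GetObjectPresignRequest.builder()
        .signatureDuration(Duration.ofMinutes(10))
        .getObjectRequest(getRequest)
        .build();
URL signedUrl = presigner.presignGetObject(presignRequest).url();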

Takeaways

Moving to the cloud isn’t a simple task and poses a lot of challenges. By using technology provided by AWS and INT’s IVAAP, underpinned by OSDU data standardization, we can create a low-latency data QC and visualization system that puts all the data into one place, provides tools to search for data of interest, enables real-time, on-demand access to the data from any location with Internet access, and does all that in a secure manner.

For more information on IVAAP, please visit int.com/ivaap/ or to learn more about how INT works with AWS to facilitate subsurface data visualization, check out our webinar, “A New Era in O&G: Critical Components of Bringing Subsurface Data to the Cloud.”


Filed Under: IVAAP Tagged With: AWS, cloud, data visualization, digital transformation, ivaap, subsurface data visualization

Nov 09 2020

Human Friendly Error Handling in the IVAAP Data Backend

As the use cases of IVAAP grow, the implementation of the data backend evolves. Past releases of IVAAP have been focused on providing data portals to our customers. Since then, a new use case has appeared where IVAAP is used to validate the injection of data in the cloud. Both use cases have a lot in common, but they differ in the way errors should be handled.

In a portal, when a dataset fails to load, the reason why needs to stay “hidden” from end users. The inner workings of the portal and its data storage mechanisms should not be exposed, as they are irrelevant to the user trying to open a new dataset. When IVAAP is used to validate the results of an injection workflow, many more details about where the data is and how it failed to load need to be communicated. And these details should be expressed in a human-friendly way.

To illustrate the difference between a human-friendly message and a non-human-friendly message, let’s take the hypothetical case where a fault file should have been posted as an object in Amazon S3, but the upload part of the ingestion workflow failed for some reason. When trying to open that dataset, the Amazon SDK would report this low-level error: “The specified key does not exist. (Service: S3, Status Code: 404, Request ID: XXXXXX)”. In the context of an ingestion workflow, a more human-friendly message would be “This fault dataset is backed by a file that is either missing or inaccessible.”

 


 

The IVAAP Data Backend is written in Java. This language has a built-in way to handle errors, so a developer’s first instinct is to use this mechanism to pass human friendly messages back to end users. However, this approach is not as practical as it seems.  The Java language doesn’t make a distinction between human-friendly error messages and low-level error messages such as the one sent by the Amazon SDK, meant to be read only by developers. Essentially, to differentiate them, we would need to create a HumanFriendlyException class, and use this class in all places where an error with a human-friendly explanation is available.

This approach is difficult to scale to a large body of code like IVAAP’s. And the IVAAP Data Backend is not just code; it also comes with a large set of third-party libraries that have their own idea of how to communicate errors. To make matters worse, it’s very common for developers to do this:

 

try {
    // do something here
} catch (Exception ex) {
    throw new RuntimeException(ex);
}

 

This handling wraps the exception, making it difficult to catch by the caller. A “better” implementation would be:

 

     

try {
    // do something here
} catch (HumanFriendlyException ex) {
    throw ex;
} catch (Exception ex) {
    throw new RuntimeException(ex);
}

 

While it is possible to enforce this style throughout IVAAP’s own code, you can’t do this for third-party libraries calling IVAAP’s code.

Another issue with Java exceptions is that they tend to occur at a low level, where very little context is known. If a service needs to read a local file, a message like “Can’t read file abc.txt” will only be relevant to end users if the primary function of the service call was to read that file. If reading this file was only incidental to completing the service, bubbling up an exception about this file all the way to the end user will not help.

To provide human-friendly error messages, IVAAP uses a layered approach instead:

  • High-level code that catches exceptions reports them with a human-friendly message to a specific logging system
  • When exceptions are thrown in low-level code, issues that can be expressed in a human-friendly way are also reported to that same logging system

With this layered approach where there is a high-level “catch all”, IVAAP is likely to return relevant human friendly errors for most service calls. And the quality of the message improves as more low-level logging is added. This continuous improvement effort is more practical than a pure exception-based architecture because it can be done without having to refactor how/when Java exceptions are thrown or caught. 
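
To make the layered approach more concrete, here is a sketch of what reporting to such a logging system could look like. The HumanFriendlyLog class, the store and requestContext variables, and the parseFaults method are hypothetical names used for illustration, not IVAAP’s actual SDK:

try {
    // Low-level code: fetch the fault file from object storage.
    InputStream stream = store.openStream(bucketName, faultFileKey);
    return parseFaults(stream);
} catch (NoSuchKeyException ex) {
    // Report the issue in human-friendly terms to the dedicated logging system.
    HumanFriendlyLog.report(requestContext,
            "This fault dataset is backed by a file that is either missing or inaccessible.");
    // Then let the original, developer-oriented exception propagate as usual.
    throw ex;
}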

To summarize, the architecture of IVAAP avoids using Java exceptions when human-friendly error messages can be communicated. But this is not just an architecture where human-friendly errors use an alternate path to bubble up all the way to the user. It has some reactive elements to it.

For example, if a user calls a backend service to access a dataset, and this dataset fails to load, a 404/Not Found HTTP status code is sent by default with no further details. However, if a human friendly error was issued during the execution of this service, the status code changes to 500/Internal Server Error, and the content of the human friendly message is included in the JSON output of this service. This content is then picked up by the HTML5 client to show to the user. I call this approach “reactive” because unlike a classic logging system, the presence of logs modifies the visible behavior of the service.

With the 2.7 release of IVAAP, we created two categories of human friendly logs. One is connectivity. When a human friendly connectivity log is present, 404/Not Found errors and empty collections are reported with a 500/Internal Server Error HTTP status code. The other is entitlement. When a human friendly entitlement log is present, 404/Not Found errors and empty collections are reported with a 403/Forbidden HTTP status code.
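
The resulting status-code decision can be summarized with a small sketch, again using hypothetical names:

// Default: a dataset that fails to load is reported as 404/Not Found.
int status = HttpURLConnection.HTTP_NOT_FOUND;
if (HumanFriendlyLog.hasEntitlementIssue(requestContext)) {
    // An entitlement log was recorded during the call: report 403/Forbidden.
    status = HttpURLConnection.HTTP_FORBIDDEN;
} else if (HumanFriendlyLog.hasConnectivityIssue(requestContext)) {
    // A connectivity log was recorded: report 500/Internal Server Error
    // and include the human-friendly message in the JSON output.
    status = HttpURLConnection.HTTP_INTERNAL_ERROR;
}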

The overall decision on which error message to show to users belongs to the front-end. Only the front-end knows the full context of the task a user is performing. The error handling in the IVAAP Data Backend provides a sane default that the viewer can elect to use, depending on context. OSDU is one of the deployments where the error handling of the data backend is key to the user experience. The OSDU platform has ingestion workflows outside of IVAAP, and with the error reporting capabilities introduced in 2.7, IVAAP becomes a much more effective tool to QA the results of these workflows.

For more information on INT’s newest platform, IVAAP, please visit www.int.com/products/ivaap/

 


Filed Under: IVAAP Tagged With: data, ivaap, java, SDK

Nov 06 2020

How to Get the Best Performance out of Your Seismic Web Applications

One of the most challenging data management problems faced in the industry is with seismic files. Some oil and gas companies estimate that they acquire a petabyte of data per day or more. Domain knowledge and specific approaches are required to move, access, and visualize that data.

In this blog post, we will dive deep into the details of the modern technology that can be used to achieve this speedup. We will also cover common challenges around seismic visualization, how INT helps solve these challenges with advanced compression and decompression techniques, how INT uses vectorization to speed up compression, and more.

What Is IVAAP?

IVAAP is a data visualization platform that accelerates the delivery of cloud-enabled geoscience, drilling, and production solutions.

  • IVAAP Client offers flexible dashboards, 2D & 3D widgets, sessions, and templates
  • IVAAP Server side connects to multiple data sources, integrates with your workflows, and offers real-time services
  • IVAAP Admin client manages user access and projects


 

Server – Client Interaction

Interaction occurs when the client requests the list of available files from the server, the server returns the file list, the user chooses a file to display, and then the server starts sending chunks of data while the client displays them.


Some issues encountered with this scheme include:

  • Seismic data files are huge in size — they can be hundreds of gigabytes or even terabytes.
  • Because of the file size, it takes too much time to transfer files via network.
  • The network may not have enough bandwidth.

The goals of this scheme are to:

  • Speed up file transfer time
  • Reduce data size for transfer
  • Add user controls for different network bandwidth

And the solution:

  • We decided to implement server-side compression and client-side decompression. We also decided to let the client provide a parameter we call the acceptable error level, which controls how much precision may be lost in the compression/decompression round trip.


 

By taking a closer look at compression and decompression, we can see that the original seismic data goes through a set of five transformations: AGC, Normalization, Haar Wavelets, Quantization, and Huffman. As a result of these transformations, we get a compressed file that can be sent to clients via the network. On the client’s side, the decompression process runs in the opposite direction, from inverse Huffman to inverse AGC. This is how clients get the data back. They do not get the precise, original data, but data after the compression and decompression process. That’s why we added an acceptable error level for the compression and decompression process: in many scenarios, clients don’t require the full original data with the full level of precision. For example, sometimes the client only needs to review the seismic data. Using this acceptable error level, they can control how much data will be passed over the network and, of course, speed up this process.
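
As a simplified illustration of how the acceptable error level can drive compression, the quantization step can pick its step size directly from that tolerance: with a step of twice the acceptable error, no sample can drift from its original value by more than the tolerance after decompression. This is a generic sketch, not INT’s actual algorithm:

// Quantize normalized samples with a step derived from the acceptable error.
// Reconstructed values differ from the originals by at most acceptableError.
static int[] quantize(float[] samples, float acceptableError) {
    float step = 2 * acceptableError;
    int[] quantized = new int[samples.length];
    for (int i = 0; i < samples.length; i++) {
        quantized[i] = Math.round(samples[i] / step);
    }
    return quantized; // these small integers are what the Huffman stage then encodes
}

static float[] dequantize(int[] quantized, float acceptableError) {
    float step = 2 * acceptableError;
    float[] samples = new float[quantized.length];
    for (int i = 0; i < quantized.length; i++) {
        samples[i] = quantized[i] * step;
    }
    return samples;
}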

The resulting scheme looks like this:


 

The client requests a file list from the server, the user chooses a file to display, and then the server compresses the data and starts sending it to the client; the client decompresses it and displays it. This is repeated for each tile to display.

So why not use any other existing compression, like GZIP, LZ Deflate, etc.? We tried these compressions, but we found out that this type of compression is not as effective as we’d like it to be on our seismic data.

Server-Side Interaction

The primary objective was to speed up the current implementation of compression and decompression on both the server and client side.

The proposal:

  • Server-side compression is implemented in Java, so we decided to create a C++ implementation of the compression sequence and use a JNI layer to call the native methods (a minimal JNI sketch follows this list). Client-side decompression is implemented in JavaScript, so we decided to create a C++ implementation of decompression and use WebAssembly (WASM) to integrate the C++ code into JS.
  • We implemented both compression and decompression algorithms in C++, but after comparing the results and performance of C++ and Java, we discovered that C++ was only about 1.5 times faster than a warmed-up JVM. That’s why we decided to move on and apply SIMD instructions for further speedup.
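
A minimal sketch of the JNI side of that arrangement (the class, method, and library names are illustrative, not INT’s actual code):

public final class NativeCompressor {

    static {
        // Load the C++ implementation compiled as a shared library (e.g. libseismiccompress.so).
        System.loadLibrary("seismiccompress");
    }

    // Declared in Java, implemented in C++, and called through the JNI layer.
    public static native byte[] compressTraces(float[] samples, float acceptableError);
}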

Single Instruction Multiple Data (SIMD)


SIMD architecture performs the same operation on multiple data elements in parallel. With a scalar operation, you have to perform four separate calculations to get the result; with a SIMD operation, you apply a single instruction to a whole vector of values to get the same result.

SIMD benefits:

  • Allows processing of several data values with one single instruction.
  • Much faster computation on predefined computation patterns.

SIMD drawbacks:

  • SIMD operations cannot be used to process multiple data in different ways.
  • SIMD operations can only be applied to predefined processing patterns with independent data handling.

Normalization: C++ scalar implementation


Normalization: C++ SIMD SSE implementation


Server-Side Speedup Results

There are different types of speedup for different algorithms:

  • Normalization is 9 times faster than the scalar C++ version
  • Haar Wavelets is 6 times faster than the scalar C++ version
  • Huffman has no performance increase (not vectorizable algorithm)

Overall, the server-side compression performance improvement is around 3 times faster than the Java version. This is applying SIMD C++ code. This was good for us, so we decided to move on to the client-side speedup.

Client-Side Speedup

For the client-side speedup, we implemented decompression algorithms in C++ and used WASM to integrate the C++ code in JavaScript.

WebAssembly

WASM is:

  • A binary executable format that can run in browsers
  • A low-level virtual machine
  • A compilation target for high-level languages

WASM is not: 

  • A programming language
  • Tied to the web; it can also run outside the browser

Steps to get WASM working:


  • Compile C/C++ code with Emscripten to obtain a WASM binary
  • Bind WASM binary to the page using a JavaScript “glue code” 
  • Run app and let the browser instantiate the WASM module, the memory, and the table of references. Once that is done, the WebApp is fully operative. 

C++ Code to Integrate (TaperFilter.h/cpp)


Emscripten Bindings 


WebAssembly Integration Example


Client-Side Speedup Takeaways:

  • Emscripten supports the WebAssembly SIMD proposal
  • Vectorized code will be executed by browsers
  • The results of vectorization for the decompression algorithms are:
    • Inv Normalization: 6 times speedup
    • Inv Haar Wavelets: 10 times speedup
    • Inv Huffman: no performance improvement (not vectorizable)

Overall, the client-side decompression performance improvement with vectorized C++ code was around 6 times faster than the JavaScript version.

For more information on GeoToolkit, please visit int.com/geotoolkit/ or check out our webinar, “How to Get the Best Performance of Your Seismic Web Applications.”


Filed Under: IVAAP Tagged With: compression, ivaap, java, javascript, seismic
