Data Types#
Altay Sansal
Apr 29, 2024
Scalar Type#
Scalar types are used to represent numbers and boolean values in MDIO arrays.
ScalarType | Scalar array data type.
These numbers can be integers (whole numbers without a decimal
point, like 1, -15, 204) or floating-point numbers (numbers with a fractional part,
like 3.14, -0.001, 2.71828) in various 16-64 bit formats such as float32.
It is important to choose the right type for the content of the data for type safety,
memory efficiency, performance, and accuracy of the numbers represented. Most scientific
datasets are float16, float32, or float64 values. However, there are many good
use cases for integer and complex values as well.
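To make the memory trade-off concrete, here is a minimal sketch using only Python's standard library. It is not MDIO API: the `FLOAT_CODES` mapping, the `array_nbytes` helper, and the grid shape are hypothetical, chosen only to show how element width scales the uncompressed size of an array.

```python
import math
import struct

# struct format codes: "e" = 16-bit half, "f" = 32-bit single, "d" = 64-bit double.
FLOAT_CODES = {"float16": "e", "float32": "f", "float64": "d"}


def array_nbytes(dtype_name, shape):
    """Return the uncompressed size in bytes of an array with this shape."""
    itemsize = struct.calcsize("<" + FLOAT_CODES[dtype_name])
    return itemsize * math.prod(shape)


shape = (1000, 1000, 500)  # hypothetical inline x crossline x sample grid
sizes = {name: array_nbytes(name, shape) for name in FLOAT_CODES}
for name, nbytes in sizes.items():
    print(f"{name}: {nbytes / 1e9:.1f} GB")
```

Each step up in width doubles the footprint, so the narrowest type that still meets the accuracy requirement is usually the better choice.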
The ScalarType values MDIO supports are listed in the tables below.
Data Type | Options | Example Value
---|---|---
bool | False, True |
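Data Type | Range | Example Value
---|---|---
int8 | -128 to 127 |
int16 | -32,768 to 32,767 |
int32 | -2,147,483,648 to 2,147,483,647 |
int64 | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |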
Data Type | Range | Example Value
---|---|---
uint8 | 0 to 255 |
uint16 | 0 to 65,535 |
uint32 | 0 to 4,294,967,295 |
uint64 | 0 to 18,446,744,073,709,551,615 |
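Integer ranges follow directly from the bit width. The sketch below derives them for signed (two's complement) and unsigned types; `int_range` is a hypothetical helper written for illustration, not part of MDIO.

```python
def int_range(bits, signed):
    """Inclusive (min, max) for a two's-complement or unsigned integer."""
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


for bits in (8, 16, 32, 64):
    print(f"int{bits}: {int_range(bits, True)}  uint{bits}: {int_range(bits, False)}")
```

An unsigned type reaches roughly twice the maximum of the signed type with the same width, which helps when values such as counts or indices are never negative.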
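Data Type | Range | Example Value
---|---|---
float16 | -65,504 to 65,504 |
float32 | -3.4028235e+38 to 3.4028235e+38 |
float64 | -1.7976931348623157e+308 to 1.7976931348623157e+308 |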
Precision
float16: 2 decimal places
float32: 7 decimal places
float64: 16 decimal places
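One way to see these precision limits is to store a value in each format and read it back. This is a sketch with Python's `struct` module rather than anything MDIO-specific; `round_trip` is a hypothetical helper.

```python
import math
import struct


def round_trip(value, code):
    """Store value in the given binary float format and read it back."""
    return struct.unpack("<" + code, struct.pack("<" + code, value))[0]


for name, code in (("float16", "e"), ("float32", "f"), ("float64", "d")):
    stored = round_trip(math.pi, code)
    print(f"{name}: {stored!r} (error {abs(stored - math.pi):.1e})")
```

The float16 copy of pi comes back as 3.140625, matching only the first two decimal places, while the float64 copy is unchanged because Python floats are already 64-bit.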
Data Type | Range | Example Value
---|---|---
complex64 | -3.4028235e+38 to 3.4028235e+38 |
complex128 | -1.7976931348623157e+308 to 1.7976931348623157e+308 |
Ranges are for both real and imaginary parts.
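The range applies to each part separately because a complex value is stored as a pair of floats. The sketch below assumes the common convention (the one NumPy uses) in which a 64-bit complex number is two 32-bit floats, real part first; `pack_complex64` is a hypothetical helper and little-endian byte order is an assumption.

```python
import struct


def pack_complex64(z):
    """Pack a complex number as two little-endian float32 values (real, imag)."""
    return struct.pack("<2f", z.real, z.imag)


raw = pack_complex64(1.5 - 2.25j)
real, imag = struct.unpack("<2f", raw)
print(len(raw), complex(real, imag))
```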
Structured Type#
A structured data type organizes and stores data in a fixed arrangement, allowing memory-efficient access and manipulation.
StructuredType | Structured array type with packed fields.
StructuredField | Structured array field with name, format.
Structured data types are an essential component in handling complex data structures, particularly in specialized domains like seismic data processing for subsurface imaging applications. These data types allow for the organization of heterogeneous data into a single, structured format.
They are designed to be memory-efficient, which is vital for handling large seismic datasets. Structured data types are adaptable, allowing for the addition or modification of fields.
A StructuredType consists of StructuredFields.
Fields can be different numeric types, and each represents a specific
attribute of the seismic data, like coordinates, line numbers, and time stamps.
Each StructuredField must specify a name and a data format (format).
All the structured fields will be packed and there will be no gaps between them.
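Packing means each field starts exactly where the previous one ends. The following sketch computes byte offsets for a list of (name, format) pairs under that rule. The `CODES` mapping and the `packed_layout` helper are hypothetical and cover only a few formats; this is not how MDIO itself is implemented.

```python
import struct

# Standard-size struct codes for a few format names (illustrative subset).
CODES = {"int16": "h", "int32": "i", "uint32": "I", "float32": "f", "float64": "d"}


def packed_layout(fields):
    """Return ({name: byte offset}, total size) for fields packed with no gaps."""
    offsets, offset = {}, 0
    for name, fmt in fields:
        offsets[name] = offset
        offset += struct.calcsize("<" + CODES[fmt])
    return offsets, offset


offsets, size = packed_layout(
    [("inline", "int32"), ("offset", "int16"), ("cdp-x", "float64")]
)
print(offsets, size)
```

Because no padding is inserted, the total size is simply the sum of the field widths (4 + 2 + 8 = 14 bytes here).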
Examples#
The tables above list the range of each ScalarType. The snippets below show how a variable declares its type.
Variable foo with type float32.
{
  "name": "foo",
  "dataType": "float32",
  "dimensions": ["x", "y"]
}
Variable bar with type uint8.
{
  "name": "bar",
  "dataType": "uint8",
  "dimensions": ["x", "y"]
}
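A consumer of such a definition might check that the declared type is a known scalar name before using it. The sketch below is hypothetical validation code, not MDIO's own: the `SCALAR_TYPES` set is assumed from the type names used in this document (with the complex names following NumPy conventions), and `scalar_type_of` is an illustrative helper.

```python
import json

# Assumed set of scalar type names; not taken from the MDIO source code.
SCALAR_TYPES = {
    "bool",
    "int8", "int16", "int32", "int64",
    "uint8", "uint16", "uint32", "uint64",
    "float16", "float32", "float64",
    "complex64", "complex128",
}


def scalar_type_of(variable_json):
    """Return the variable's scalar dataType, or raise ValueError."""
    spec = json.loads(variable_json)
    data_type = spec.get("dataType")
    if not isinstance(data_type, str) or data_type not in SCALAR_TYPES:
        raise ValueError(f"{spec.get('name')!r} does not have a known scalar dataType")
    return data_type


bar = '{"name": "bar", "dataType": "uint8", "dimensions": ["x", "y"]}'
print(scalar_type_of(bar))
```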
Below are a couple of examples of StructuredType with varying lengths.
We can specify a variable named headers that holds a 16-byte struct with
four int32 values.
{
  "name": "headers",
  "dataType": {
    "fields": [
      { "name": "cdp-x", "format": "int32" },
      { "name": "cdp-y", "format": "int32" },
      { "name": "inline", "format": "int32" },
      { "name": "crossline", "format": "int32" }
    ]
  },
  "dimensions": ["inline", "crossline"]
}
This will yield an in-memory or on-disk struct that looks like this (for each element):
←─ 4 ─→ ←─ 4 ─→ ←─ 4 ─→ ←─ 4 ─→ = 16-bytes
┌───────┬───────┬───────┬───────┐
│ int32 ╎ int32 ╎ int32 ╎ int32 │ ⋯ (next sample)
└───────┴───────┴───────┴───────┘
└→ cdp-x └→ cdp-y └→ inline └→crossline
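The same 16-byte layout can be reproduced with Python's `struct` module. This is only a sketch: little-endian byte order is an assumption, and the header values are made up.

```python
import struct

# Four little-endian int32 fields: cdp-x, cdp-y, inline, crossline.
HEADER = struct.Struct("<4i")

record = HEADER.pack(451200, 6780300, 120, 345)  # made-up header values
cdp_x, cdp_y, inline, crossline = HEADER.unpack(record)
print(HEADER.size, len(record), inline, crossline)
```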
The example below mixes different data types.
{
  "name": "headers",
  "dataType": {
    "fields": [
      { "name": "cdp", "format": "uint32" },
      { "name": "offset", "format": "int16" },
      { "name": "cdp-x", "format": "float64" },
      { "name": "cdp-y", "format": "float64" }
    ]
  },
  "dimensions": ["inline", "crossline"]
}
This will yield an in-memory or on-disk struct that looks like this (for each element):
←── 4 ──→ ← 2 → ←─── 8 ───→ ←─── 8 ───→ = 22-bytes
┌─────────┬─────┬───────────┬───────────┐
│ uint32  ╎int16╎  float64  ╎  float64  │ ⋯ (next sample)
└─────────┴─────┴───────────┴───────────┘
└→ cdp └→ offset └→ cdp-x └→ cdp-y
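With mixed widths, packing matters: the fields sum to 4 + 2 + 8 + 8 = 22 bytes, whereas a natively aligned C-style struct would typically insert padding before the first float64. The sketch below contrasts the two using Python's `struct` module. Little-endian order and the sample values are assumptions, and the native size depends on the platform.

```python
import struct

# "<" = standard sizes with no padding: uint32, int16, float64, float64.
packed = struct.Struct("<Ihdd")
# "@" = native sizes and alignment; padding may appear between fields.
aligned = struct.Struct("@Ihdd")

record = packed.pack(1001, -250, 451200.5, 6780300.25)  # made-up values
cdp, offset, cdp_x, cdp_y = packed.unpack(record)
print("packed size:", packed.size)
print("native aligned size:", aligned.size)
```

The packed form is the one that matches a struct with no gaps between fields, so each element occupies exactly 22 bytes.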