Chunk Grid Models

Altay Sansal

Apr 29, 2024

Variables in the MDIO data model can use different types of chunk grids, which determine how a multi-dimensional array is split into chunks for storage and access. This page covers the four chunk grid models in the MDIO schema: RegularChunkGrid, RegularChunkShape, RectilinearChunkGrid, and RectilinearChunkShape.

MDIO implements these data models following the guidelines of the Zarr v3 specification and the Zarr Enhancement Proposals (ZEPs).

Regular Grid

The regular grid models represent a rectangular and regularly spaced chunk grid.

RegularChunkGrid

Represents a rectangular and regularly spaced chunk grid.

RegularChunkShape

Represents regular chunk sizes along each dimension.

A 1D array with size = 31 can be divided into 5 chunks using a chunk size of 7. The first four chunks are full, and the last chunk is truncated to 3 elements to match the size of the array.

{ "name": "regular", "configuration": { "chunkShape": [7] } }

Using the above schema, the resulting array chunks will look like this:

 ←─ 7 ─→ ←─ 7 ─→ ←─ 7 ─→ ←─ 7 ─→   3
┌───────┬───────┬───────┬───────┬───┐
└───────┴───────┴───────┴───────┴───┘
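The truncation rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the MDIO API; the helper name `regular_chunk_sizes` is hypothetical.

```python
def regular_chunk_sizes(dim_size: int, chunk_size: int) -> list[int]:
    """Return the size of every chunk along one dimension of a regular grid.

    Hypothetical helper for illustration only; not an MDIO function.
    """
    if dim_size < 0 or chunk_size <= 0:
        raise ValueError("dim_size must be >= 0 and chunk_size must be > 0")
    full, remainder = divmod(dim_size, chunk_size)
    sizes = [chunk_size] * full
    if remainder:
        sizes.append(remainder)  # the last chunk is truncated to fit the array
    return sizes


print(regular_chunk_sizes(31, 7))  # [7, 7, 7, 7, 3]
```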

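A 2D array with shape rows, cols = (7, 17) can be divided into 9 chunks (3 along rows × 3 along columns) using a chunk shape of (3, 7). Chunks on the trailing edge of each dimension are truncated.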

{ "name": "regular", "configuration": { "chunkShape": [3, 7] } }

Using the above schema, the resulting 2D array chunks will look like below. Note that the rows and columns are conceptual and visually not to scale.

 ←─ 7 ─→ ←─ 7 ─→   3
┌───────┬───────┬───┐
│       ╎       ╎   │ ↑
│       ╎       ╎   │ 3
│       ╎       ╎   │ ↓
├╶╶╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤
│       ╎       ╎   │ ↑
│       ╎       ╎   │ 3
│       ╎       ╎   │ ↓
├╶╶╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤
│       ╎       ╎   │ 1
└───────┴───────┴───┘
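The chunk count for an N-dimensional regular grid is the product of the per-dimension counts, each rounded up. A short sketch, assuming a hypothetical helper `chunks_per_dim` that is not part of MDIO:

```python
import math


def chunks_per_dim(shape: tuple[int, ...], chunk_shape: tuple[int, ...]) -> tuple[int, ...]:
    """Number of chunks along each dimension (ceiling division).

    Hypothetical helper for illustration only; not an MDIO function.
    """
    if len(shape) != len(chunk_shape):
        raise ValueError("shape and chunk_shape must have the same length")
    # -(-a // b) is integer ceiling division, so partial edge chunks are counted
    return tuple(-(-size // chunk) for size, chunk in zip(shape, chunk_shape))


counts = chunks_per_dim((7, 17), (3, 7))
print(counts, math.prod(counts))  # (3, 3) 9
```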

Rectilinear Grid

The RectilinearChunkGrid model extends the concept of chunk grids to accommodate rectangular and irregularly spaced chunks. It is useful when non-uniform chunk sizes are necessary. RectilinearChunkShape specifies the chunk sizes for each dimension as a list, allowing for irregular intervals.

RectilinearChunkGrid

Represents a rectangular and irregularly spaced chunk grid.

RectilinearChunkShape

Represents irregular chunk sizes along each dimension.

Note

The chunk sizes listed for each dimension in chunkShape must sum to the size of the respective array dimension.
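A check for this constraint could look like the sketch below. The function `validate_rectilinear` is hypothetical and written only for illustration; the schema shown on this page does not state whether MDIO performs this check itself.

```python
def validate_rectilinear(shape: tuple[int, ...], chunk_shape: list[list[int]]) -> None:
    """Raise ValueError if the chunk sizes do not sum to the array shape.

    Hypothetical helper for illustration only; not an MDIO function.
    """
    if len(shape) != len(chunk_shape):
        raise ValueError("chunk_shape needs one list of sizes per array dimension")
    for axis, (dim_size, sizes) in enumerate(zip(shape, chunk_shape)):
        total = sum(sizes)
        if total != dim_size:
            raise ValueError(
                f"axis {axis}: chunk sizes sum to {total}, expected {dim_size}"
            )


# Passes silently: 10 + 7 + 5 + 7 + 10 == 39
validate_rectilinear((39,), [[10, 7, 5, 7, 10]])
```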

A 1D array with size = 39 can be divided into 5 irregularly sized chunks.

{ "name": "rectilinear", "configuration": { "chunkShape": [[10, 7, 5, 7, 10]] } }

Using the above schema, the resulting array chunks will look like this:

 ←── 10 ──→ ←─ 7 ─→  5  ←─ 7 ─→ ←── 10 ──→
┌──────────┬───────┬─────┬───────┬──────────┐
└──────────┴───────┴─────┴───────┴──────────┘
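Because the chunk sizes are explicit, the start and stop index of each chunk follows from a running sum. A minimal sketch, with `chunk_bounds` as a hypothetical helper name:

```python
from itertools import accumulate


def chunk_bounds(sizes: list[int]) -> list[tuple[int, int]]:
    """Return (start, stop) index pairs for chunks along one dimension.

    Hypothetical helper for illustration only; not an MDIO function.
    """
    stops = list(accumulate(sizes))  # running sum gives each chunk's end index
    starts = [0] + stops[:-1]        # each chunk starts where the previous ended
    return list(zip(starts, stops))


print(chunk_bounds([10, 7, 5, 7, 10]))
# [(0, 10), (10, 17), (17, 22), (22, 29), (29, 39)]
```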

A 2D array with shape rows, cols = (7, 25) can be divided into 12 rectilinear (rectangular but irregular) chunks: 3 along rows × 4 along columns. Note that the rows and columns are conceptual and visually not to scale.

{ "name": "rectilinear", "configuration": { "chunkShape": [[3, 1, 3], [10, 5, 7, 3]] } }

 ←── 10 ──→  5  ←─ 7 ─→   3
┌──────────┬─────┬───────┬───┐
│          ╎     ╎       ╎   │ ↑
│          ╎     ╎       ╎   │ 3
│          ╎     ╎       ╎   │ ↓
├╶╶╶╶╶╶╶╶╶╶┼╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤
│          ╎     ╎       ╎   │ 1
├╶╶╶╶╶╶╶╶╶╶┼╶╶╶╶╶┼╶╶╶╶╶╶╶┼╶╶╶┤
│          ╎     ╎       ╎   │ ↑
│          ╎     ╎       ╎   │ 3
│          ╎     ╎       ╎   │ ↓
└──────────┴─────┴───────┴───┘
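The individual chunk shapes in a rectilinear grid are the Cartesian product of the per-dimension size lists. The sketch below enumerates them for the 2D example; `rectilinear_chunk_shapes` is a hypothetical name used only here.

```python
from itertools import product


def rectilinear_chunk_shapes(chunk_shape: list[list[int]]) -> list[tuple[int, ...]]:
    """Enumerate the shape of every chunk, last dimension varying fastest.

    Hypothetical helper for illustration only; not an MDIO function.
    """
    return list(product(*chunk_shape))


shapes = rectilinear_chunk_shapes([[3, 1, 3], [10, 5, 7, 3]])
print(len(shapes))             # 12
print(shapes[0], shapes[-1])   # (3, 10) (3, 3)
```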

Model Reference

RegularChunkGrid
pydantic model mdio.schemas.chunk_grid.RegularChunkGrid

Represents a rectangular and regularly spaced chunk grid.

JSON schema
{
   "title": "RegularChunkGrid",
   "description": "Represents a rectangular and regularly spaced chunk grid.",
   "type": "object",
   "properties": {
      "name": {
         "default": "regular",
         "description": "The name of the chunk grid.",
         "title": "Name",
         "type": "string"
      },
      "configuration": {
         "allOf": [
            {
               "$ref": "#/$defs/RegularChunkShape"
            }
         ],
         "description": "Configuration of the regular chunk grid."
      }
   },
   "$defs": {
      "RegularChunkShape": {
         "additionalProperties": false,
         "description": "Represents regular chunk sizes along each dimension.",
         "properties": {
            "chunkShape": {
               "description": "Lengths of the chunk along each dimension of the array.",
               "items": {
                  "type": "integer"
               },
               "title": "Chunkshape",
               "type": "array"
            }
         },
         "required": [
            "chunkShape"
         ],
         "title": "RegularChunkShape",
         "type": "object"
      }
   },
   "additionalProperties": false,
   "required": [
      "configuration"
   ]
}

field configuration: RegularChunkShape [Required]

Configuration of the regular chunk grid.

field name: str = 'regular'

The name of the chunk grid.


pydantic model mdio.schemas.chunk_grid.RegularChunkShape

Represents regular chunk sizes along each dimension.

JSON schema
{
   "title": "RegularChunkShape",
   "description": "Represents regular chunk sizes along each dimension.",
   "type": "object",
   "properties": {
      "chunkShape": {
         "description": "Lengths of the chunk along each dimension of the array.",
         "items": {
            "type": "integer"
         },
         "title": "Chunkshape",
         "type": "array"
      }
   },
   "additionalProperties": false,
   "required": [
      "chunkShape"
   ]
}

field chunkShape: list[int] [Required]

Lengths of the chunk along each dimension of the array.
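To make the schema constraints concrete, the following standard-library sketch checks a JSON document against the rules listed in the RegularChunkGrid schema (required `configuration`, no additional properties, `chunkShape` as a list of integers). It is a hand-written stand-in and does not use the MDIO pydantic models; the function name `parse_regular_chunk_grid` is hypothetical.

```python
import json


def parse_regular_chunk_grid(text: str) -> list[int]:
    """Validate a regular chunk grid JSON document and return its chunkShape.

    Hypothetical stand-in mirroring the JSON schema; not an MDIO function.
    """
    doc = json.loads(text)
    if not isinstance(doc, dict):
        raise ValueError("chunk grid must be a JSON object")
    if set(doc) - {"name", "configuration"}:
        raise ValueError("additional properties are not allowed")
    if "configuration" not in doc:
        raise ValueError("'configuration' is required")
    if not isinstance(doc.get("name", "regular"), str):
        raise ValueError("'name' must be a string")
    config = doc["configuration"]
    if not isinstance(config, dict) or set(config) != {"chunkShape"}:
        raise ValueError("'configuration' must contain exactly 'chunkShape'")
    chunk_shape = config["chunkShape"]
    # bool is a subclass of int in Python, so exclude it explicitly
    if not isinstance(chunk_shape, list) or not all(
        isinstance(n, int) and not isinstance(n, bool) for n in chunk_shape
    ):
        raise ValueError("'chunkShape' must be a list of integers")
    return chunk_shape


print(parse_regular_chunk_grid('{"name": "regular", "configuration": {"chunkShape": [3, 7]}}'))
# [3, 7]
```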

RectilinearChunkGrid
pydantic model mdio.schemas.chunk_grid.RectilinearChunkGrid

Represents a rectangular and irregularly spaced chunk grid.

JSON schema
{
   "title": "RectilinearChunkGrid",
   "description": "Represents a rectangular and irregularly spaced chunk grid.",
   "type": "object",
   "properties": {
      "name": {
         "default": "rectilinear",
         "description": "The name of the chunk grid.",
         "title": "Name",
         "type": "string"
      },
      "configuration": {
         "allOf": [
            {
               "$ref": "#/$defs/RectilinearChunkShape"
            }
         ],
         "description": "Configuration of the irregular chunk grid."
      }
   },
   "$defs": {
      "RectilinearChunkShape": {
         "additionalProperties": false,
         "description": "Represents irregular chunk sizes along each dimension.",
         "properties": {
            "chunkShape": {
               "description": "Lengths of the chunk along each dimension of the array.",
               "items": {
                  "items": {
                     "type": "integer"
                  },
                  "type": "array"
               },
               "title": "Chunkshape",
               "type": "array"
            }
         },
         "required": [
            "chunkShape"
         ],
         "title": "RectilinearChunkShape",
         "type": "object"
      }
   },
   "additionalProperties": false,
   "required": [
      "configuration"
   ]
}

field configuration: RectilinearChunkShape [Required]

Configuration of the irregular chunk grid.

field name: str = 'rectilinear'

The name of the chunk grid.


pydantic model mdio.schemas.chunk_grid.RectilinearChunkShape

Represents irregular chunk sizes along each dimension.

JSON schema
{
   "title": "RectilinearChunkShape",
   "description": "Represents irregular chunk sizes along each dimension.",
   "type": "object",
   "properties": {
      "chunkShape": {
         "description": "Lengths of the chunk along each dimension of the array.",
         "items": {
            "items": {
               "type": "integer"
            },
            "type": "array"
         },
         "title": "Chunkshape",
         "type": "array"
      }
   },
   "additionalProperties": false,
   "required": [
      "chunkShape"
   ]
}

field chunkShape: list[list[int]] [Required]

Lengths of the chunk along each dimension of the array.