Loading VTK data into Python

jbnaliboff · January 27, 2021, 7:22pm

Hi all,

This topic has come up a number of times in recent conversations, so I thought I would share the general workflow for importing VTK data directly into python (numpy arrays).

Below is a snippet of python code that should largely work out of the box for 3D cartesian models, where all one has to do is provide the full path and name of the .pvtu in a string. Please note that lines in bold are comments, and you would need to add a # symbol before them in python. I excluded them as they produced odd formatting in the post.

Please let me know if you have any comments, questions, or suggestions.

Cheers,
John

Load modules
import numpy np; import vtk as vtk; from vtk.util import numpy_support

Load vtu data (pvtu directs to vtu files)
reader = vtk.vtkXMLPUnstructuredGridReader()
reader.SetFileName('full_path_and_name_of_pvtu_file')
reader.Update()

Example for how to set the file name:
reader.SetFileName('/home/naliboff/solution-00000.pvtu')

Get the coordinates of nodes in the mesh
nodes_vtk_array= reader.GetOutput().GetPoints().GetData()

Convert nodal vtk data to a numpy array
nodes_numpy_array = vtk.util.numpy_support.vtk_to_numpy(nodes_vtk_array)

Extract x, y and z coordinates from numpy array
x,y,z= nodes_numpy_array[:,0] , nodes_numpy_array[:,1] , nodes_numpy_array[:,2]

Determine the number of scalar fields contained in the .pvtu file
number_of_fields = reader.GetOutput().GetPointData().GetNumberOfArrays()

Determine the name of each field and place it in an array.
field_names = []
for i in range(number_of_fields):
field_names.append(reader.GetOutput().GetPointData().GetArrayName(i))

Determine the index of the field strain_rate
idx = field_names.index("strain_rate")

Example: Extract values of strain_rate
field_vtk_array = reader.GetOutput().GetPointData().GetArray(idx)
strain_rate = numpy_support.vtk_to_numpy(field_vtk_array)

chronictectonic · June 2, 2022, 7:55pm

Hi John,

Thanks for this, it has been very helpful for me trying to post-process using the python-based code fatbox, which I learned about at the last APSECT user meeting.

I am still running into an issue however and am not sure where to look for help so if this isn’t the right place, I apologize. Also, I’m not very familiar with python so that might be part of my troubles.

Anyway, I am able to run some of fatbox’s pre-processing steps using the numpy array I generated using this code (including coordinates of points as well as strain rates). But I’m running into issues with what I think is the code using an image format, where the field data (in this case strain rate) are mapped into the coordinate data.

I tried to write a simple python code to convert my array to one with coordinates matching those of the nodal points from my mesh and then mapping the strain rate data to the positions in the array; however, this required too large of an array given the size of my solution space.

I’m wondering if there is a simpler way to accomplish this that you know of.

Thanks,
Stephanie

chronictectonic · June 7, 2022, 2:00pm

I also emailed the developer of fatbox and they suggested changing the resolution and/or attempting to analyze a subset of the solution space. I intend to try this next.

I was able to create the array mentioned previously using my university cluster’s high memory nodes but am still wondering if there is a more efficient path forward.

Thanks,
Stephanie

jbnaliboff · June 21, 2022, 7:24pm

Hi Stephanie,

Thanks for posting and sorry for the delay in responding. I agree with the Fatbox folks that the best approach is to analyze a subset of the region.

Can you send over the following information about the model and your analysis goals?

What is the model size (dimensions, resolution)? Also, what is the storage space (i.e., how large are files) for each time step?
What spatial area do you want to analyze (cross-section, surface, etc)?

You will need to load all the data for each time step into memory initially, but afterwards you can use spatial constraints to analyze a small subset of the region.

For the first issue you described (incorrect mapping of values), we will likely need to see the underlying code you are using to get information into Fatbox and how Fatbox is being used.

Cheers,
John

chronictectonic · July 13, 2022, 6:57pm

Hi John,

Sorry I’m just seeing this (I didn’t get a notification of the reply).

To answer your questions:

I was just testing out fatbox with ASPECT’s crustal deformation cookbook output using a geometry of 80km by 16km and AMR level 1 and global refinement level 3 to get about 400 cells on 5 levels. The files are 15-20 kB each with 16 for each time step.
I’m trying to analyze the two oppositely verging areas of high strain that result from this cookbook that span the 2-D section.

I was able to contour the strain rate values in python using the method you give above:

.

The fatbox developer I messaged confirmed that fatbox indeed uses image analysis tools and I was able to plot an image using the large (mostly sparse) array using the high memory nodes on my university cluster.

Thanks,
Stephanie

jbnaliboff · June 26, 2024, 3:38pm

An update to this discussion in response to a related question in the form post “Using Paraview or Matlab to do calculations with output fields”.

Rather than use the VTK commands directly, I’ve found that using PyVista provides a much more convenient way of loading data into python, and also providing a wide range of useful functionality for manipulating and visualizing the data. PyVista is largely built on VTK, and if preferred you can use both PyVista and direct VTK functionality.

There are now two examples for how to use PyVista for loading and visualizing ASPECT data in the repository, and another workflow example is provide below. At the bottom of the post are some suggestions for how to install pyvista in anaconda.

If you have any further questions on PyVista or related topics, please feel free to post them here!

Example PyVista Workflow

# Import PyVista
import pyvista as pv

# Read solution data into a variable named mesh, which contains all of the solution data. Replace the name of the pvtu with the full or relative path and name of the specific file you wish to load.
mesh = pv.read(‘solution-00000.pvtu’)

# Show all variables (data) contained with the objected mesh.point_data (you can also cell-wise data via mesh.cell_data. The output will include a list of the fields (T, p, etc) contained within the vtu file..
mesh.point_data

# Load point (x,y,z) data into a unique variable (numpy ndarray)
points = mesh.points

# Load temperature data into a unique variable. Note that the same thing can be accomplished with temperature = mesh.point_data['T']
temperature = mesh[‘T’]

PyVista Installation Instructions

While PyVista can be installed through Anaconda or PIP, the most straightforward way to ensure it works and does not produce conflicts with other python libraries is to create an anaconda environment with the package dependencies found at pyvista/environment.yml at main · pyvista/pyvista · GitHub.
After downloading this file, create a new anaconda environment using this file with conda env create -f environment.yml.
Once that environment has been created, activate it with conda activate pyvista-env and then install pyvist with conda install -c conda-forge pyvista.
You can check to see if pyvista is installed with import pyvista as pv.

Topic		Replies	Views
How can I extract data with geographic coordinate system from vtk file? ASPECT	5	182	October 9, 2023
Strange VTK file output, not readable in Paraview SPECFEM seismology	0	373	July 26, 2019
How to access vertices of ASPECT mesh with deformation (e.g. free surface case) ASPECT	8	377	June 28, 2021
Using Paraview or Matlab to do calculations with output fields ASPECT	8	1307	June 26, 2024
Importing fault data question PyLith	5	445	September 21, 2020

Loading VTK data into Python

Related topics