Loading

In the vast majority of cases, we will not generate data ourselves but load it from somewhere else. The np.load function reads data stored in a .npy or .npz file, and we assign the result to a variable.
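As a minimal sketch of the difference between the two file types, the round trip below saves a small array and loads it back. The array, the file names, and the keys first and second are hypothetical and chosen only for illustration; the files are written to a temporary directory so nothing is left on disk.

```python
import tempfile
from pathlib import Path

import numpy as np

# Hypothetical example array; any NumPy array would work here.
example = np.arange(12, dtype=np.float64).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp_dir:
    npy_path = Path(tmp_dir) / "example.npy"
    npz_path = Path(tmp_dir) / "example.npz"

    # A .npy file stores exactly one array.
    np.save(npy_path, example)
    loaded = np.load(npy_path)

    # A .npz file is an archive that stores several named arrays.
    np.savez(npz_path, first=example, second=example * 2)
    with np.load(npz_path) as archive:
        names = sorted(archive.files)
        second = archive["second"]

print(loaded.shape)  # (3, 4)
print(names)         # ['first', 'second']
```

Note that loading a .npy file gives you the array directly, while loading a .npz file gives you an archive object that you index by name.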

Usually you have the NumPy array stored on your computer; however, to make your life easier, I host our data sets online so you can download them whenever you need them. In this example, I have protein-contact-maps.npy, which stores 500 protein contact maps.

Note

A protein contact map is a graphical representation that illustrates spatial proximity between amino acid residues in a protein structure. It is commonly used in structural bioinformatics to visualize and analyze the interactions and relationships between different parts of a protein.
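To make the idea concrete, here is a rough sketch of how a map like this could be computed from residue coordinates. Everything in it is an assumption for illustration: the coordinates are random, the 16 residues and the 8 Å cutoff are example values, and this is not necessarily how the hosted data set was produced.

```python
import numpy as np

# Hypothetical coordinates: 16 residues placed at random in a 20 Å box.
rng = np.random.default_rng(seed=0)
coords = rng.uniform(0.0, 20.0, size=(16, 3))

# Pairwise Euclidean distances via broadcasting; the result has shape (16, 16).
diff = coords[:, np.newaxis, :] - coords[np.newaxis, :, :]
distances = np.linalg.norm(diff, axis=-1)

# Assumed cutoff of 8 Å: residues closer than this count as "in contact".
cutoff = 8.0
contacts = distances < cutoff

print(distances.shape)  # (16, 16)
print(contacts.dtype)   # bool
```

The resulting matrix is symmetric with zeros on the diagonal, because every residue is at distance zero from itself.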

If you have this file on your computer, you can simply specify the local file path.

np.load("protein-contact-maps.npy")

To download this array from within Python, we use the urllib.request module to request the data, then wrap the downloaded bytes in a file-like object that NumPy can read. You do not need to know how this works for this course, but I want you to know what is going on when you see code like this from me in the future.

import io
from urllib import request

import numpy as np

npy_file_url = "https://github.com/oasci-bc/python/raw/refs/heads/main/docs/files/npy/steamboat-willie.npy"

# Download the .npy file
response = request.urlopen(npy_file_url)
content = response.read()

# Load the .npy file
contact_maps = np.load(io.BytesIO(content))

# Print information from the array.
print(contact_maps.ndim)
print(contact_maps.shape)
print(contact_maps[0][0])

3
(500, 16, 16)
[ 0.          3.83347011  6.93431997  9.98147011 12.15553474 10.16739845
  9.63760185 13.3811655  14.22265434 12.16785812 11.25229836 13.22015572
 14.96301937 17.38097    19.78103447 22.77887917]
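The three printed values can be reproduced on any array of the same shape. The sketch below uses a placeholder array of zeros (an assumption, so no download is needed) to show what ndim, shape, and the two indexing styles return.

```python
import numpy as np

# Placeholder with the same shape as the loaded array: 500 maps of 16 x 16.
placeholder = np.zeros((500, 16, 16))

first_map = placeholder[0]     # the first 16 x 16 map
first_row = placeholder[0][0]  # the first row of the first map
same_row = placeholder[0, 0]   # equivalent indexing in a single step

print(placeholder.ndim)  # 3
print(first_map.shape)   # (16, 16)
print(first_row.shape)   # (16,)
```

Both placeholder[0][0] and placeholder[0, 0] select the same 16 values; the second form does it in one indexing operation.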

The line from urllib import request imports the request module from urllib, a package in Python's standard library. This module provides functions for opening and reading URLs.
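The io.BytesIO step can be tried without a network connection. In this sketch, a hypothetical array is serialized to raw bytes in memory, which stands in for what response.read() returns, and then loaded back the same way as in the download code.

```python
import io

import numpy as np

# Hypothetical array standing in for the downloaded data.
original = np.linspace(0.0, 1.0, num=5)

# Serialize the array to raw bytes in .npy format, without touching the disk.
buffer = io.BytesIO()
np.save(buffer, original)
raw_bytes = buffer.getvalue()  # plays the role of response.read()

# Wrap the bytes in a file-like object so np.load can read them.
restored = np.load(io.BytesIO(raw_bytes))

print(type(raw_bytes).__name__)            # bytes
print(np.array_equal(original, restored))  # True
```

np.load expects a file path or a file-like object, which is why the raw bytes are wrapped in io.BytesIO rather than passed in directly.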