UP | HOME

AI Frameworks

Table of Contents

1 TMP

tensorflow guide: https://www.tensorflow.org/guide/datasets

  • tf.cast

2 Dataset Loading

2.1 Keras

  • download data: tf.keras.utils.get_file(fname, url, extract=False): get file into the cache if not already there, and get placed at ~/.keras/datasets/XXX. The path is returned.
  • preloaded datasets: keras.datasets.mnist.load_data(): this will gives Tuple of Numpy arrays: (x_train, y_train), (x_test, y_test).

2.2 tf.data.Dataset

API doc: https://www.tensorflow.org/api_docs/python/tf/data/Dataset

This is an abstraction class of data that provides several useful APIs and is easy to feed into a model for training.

First, we need to create an instance, using the static method tf.data.Dataset.from_tensor_slices. This is more used than from_tensors(tensors) because the latter results in one element while the former uses the "slice", thus contains the list of tensors directly.

All the APIs are functional: they does not alter the instance, but return a new instance. The useful APIs of this class:

  • map
  • zip
  • shuffle
  • repeat(count=None): by default this will make the dataset repeat forever. When doing the training, you want that.
  • batch(batch_size, drop_remainder=False): the resulting tensor will have one more outer dimension that is the batch size. The size seems to be a placehoder, represented by ?. The batch may not be divided evenly, thus the last batch may have a smaller batch dimension. To prevent this if desired, set drop_remainder to true.
  • prefetch(buffer_size=AUTOTUNE): `prefetch` lets the dataset fetch batches, in the background while the model is training.
  • take: useful for visualization

An extended example:

all_image_paths = ['/path/to/img1', '/path/to/img2', ...]
path_ds = tf.data.Dataset.from_tensor_slices(all_image_paths)
image_ds = path_ds.map(load_and_preprocess_image)

all_image_labels = ['tulips', 'daisy', 'sunflowers', ...]
label_ds = tf.data.Dataset.from_tensor_slices(tf.cast(all_image_labels, tf.int64))

image_label_ds = tf.data.Dataset.zip((image_ds, label_ds))

def load_and_preprocess_from_path_label(path, label):
  return load_and_preprocess_image(path), label
image_label_ds = ds.map(load_and_preprocess_from_path_label)

ds = tf.data.Dataset.from_tensor_slices((all_image_paths, all_image_labels))
ds = image_label_ds.shuffle(buffer_size=image_count)
ds = ds.repeat()
ds = ds.batch(BATCH_SIZE)
ds = ds.prefetch()

2.2.1 TFRecord

Create TFRecord

tf.data.experimental.TFRecordWriter

tfrec = tf.data.experimental.TFRecordWriter('images.tfrec')
tfrec.write(image_ds)

Reading TFRecord (tf.data.TFRecordDataset):

image_ds = tf.data.TFRecordDataset('images.tfrec').map(preprocess_image)

Seems that TFRecordDataset is simply a subclass of Dataset, and only the __init__ function is different.

2.2.2 TFRecord Creation

TFRecord is closely related to ProtocolBuffer.

def _bytes_feature(value):
  """Returns a bytes_list from a string / byte."""
  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _float_feature(value):
  """Returns a float_list from a float / double."""
  return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))

def _int64_feature(value):
  """Returns an int64_list from a bool / enum / int / uint."""
  return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
def convert_to_proto(data):
    # data format example: ('0x87', ['hello', 'world'], [1.2, 0.8])
    feature = {
        'id': _bytes_feature(data[0].encode('utf-8')),
        'mutate': _bytes_feature(tf.serialize_tensor(data[1]).numpy()),
        'label': _bytes_feature(tf.serialize_tensor(data[2]).numpy())
    }
    exp = tf.train.Example(features=tf.train.Features(feature=feature))
    return exp

def decode(raw):
    feature_description = {
        'id': tf.FixedLenFeature([], tf.string, default_value=''),
        'mutate': tf.FixedLenFeature([], tf.string, default_value=''),
        'label': tf.FixedLenFeature([], tf.string, default_value='')
    }
    def _my_parse_function(pto):
      # Parse the input tf.Example proto using the dictionary above.
      return tf.parse_single_example(pto, feature_description)
    decoded = raw.map(_my_parse_function)
    # need to further parse tensor
    # FIXME can we define this inside protobuffer?
    def _my_parse2(pto):
        return {'id': pto['id'],
                'label': tf.parse_tensor(pto['label'], out_type=tf.float32),
                'mutate': tf.parse_tensor(pto['mutate'], out_type=tf.string)}
    return decoded.map(_my_parse2)
    
def __test():
    exp1 = convert_to_proto(('0x87', ['hello', 'world'], [1.2, 0.8]))
    exp2 = convert_to_proto(('0x333', ['yes', 'and', 'no'], [-1, 0.33]))
    exp1
    with tf.python_io.TFRecordWriter('111.tfrec') as writer:
        writer.write(exp1.SerializeToString())
        writer.write(exp2.SerializeToString())
    raw = tf.data.TFRecordDataset('111.tfrec')
    out = decode(raw)
    out
    for a in out:
        print(a)

3 Visualization

Display an image file on disk:

import IPython.display as display
image = display.Image(image_path)
display.display(image)

Read a image file on disk into tensor:

def load_and_preprocess_image(img_path)
    img_raw = tf.read_file(img_path)
    img_tensor = tf.image.decode_image(img_raw)
    print(img_tensor.shape)         # (212, 320, 3)
    print(img_tensor.dtype)         # <dtype: 'uint8'>
    img_final = tf.image.resize_images(img_tensor, [192, 192])
    img_final = img_final/255.0
    print(img_final.shape)          # (192, 192, 3)
    print(img_final.numpy().min())  # 0.0
    print(img_final.numpy().max())  # 1.0
    return img_final

Display image data by matplotlib:

import matplotlib.pyplot as plt
plt.imshow(image_data)
plt.grid(False)
plt.xlabel('some text')
plt.title('title')
print()

Matplotlib:

import matplotlib.pyplot as plt
plt.figure(figsize=(8,8))
for n,image in enumerate(image_ds.take(4)):
  plt.subplot(2,2,n+1)
  plt.imshow(image)
  plt.grid(False)
  plt.xticks([])
  plt.yticks([])
  plt.xlabel(caption_image(all_image_paths[n]))