What is SPyNet?

Set up SPyNet according to the image size and model. For optimal performance, resize your images so that the width and height are multiples of 32. You can also specify which model to use. The currently supported models are the fine-tuned models sintelFinal (default), sintelClean, and kittiFinal, and the base models chairsFinal and chairsClean.

spynet = require('spynet')
computeFlow = spynet.setup(512, 384, 'sintelFinal')    -- for 384x512 images

Now you can call computeFlow anytime to estimate optical flow between image pairs.
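
The multiple-of-32 recommendation above can be applied with a small helper. This is a minimal sketch in plain Lua; roundUpToMultiple is a hypothetical name and is not part of spynet.

```lua
-- Hypothetical helper (not part of spynet): round a dimension up to the
-- nearest multiple of `base` (32 by default).
local function roundUpToMultiple(n, base)
  base = base or 32
  return math.ceil(n / base) * base
end

-- Example: a 436x1024 (height x width) frame.
local height, width = 436, 1024
local fineHeight = roundUpToMultiple(height)  -- 448
local fineWidth = roundUpToMultiple(width)    -- 1024, already a multiple of 32
print(fineWidth, fineHeight)
```

The rounded sizes would then be passed to spynet.setup in the same (width, height) order as in the call above, after resizing the images to match.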

Computing flow

Load an image pair, stack the two images along the channel dimension, and normalize the result.

require 'image'
im1 = image.load('samples/00001_img1.ppm')
im2 = image.load('samples/00001_img2.ppm')
im = torch.cat(im1, im2, 1)    -- stack along the channel dimension: 6xHxW
im = spynet.normalize(im)

SPyNet works with batches of data on CUDA, so add a batch dimension, move the tensor to the GPU, and compute the flow:

im = im:resize(1, im:size(1), im:size(2), im:size(3)):cuda()
flow = computeFlow(im)

You can also use batch mode. If your images im are a tensor of size Bx6xHxW, that is, a batch of B image pairs with 6 channels each (two stacked RGB images), you can use it directly:

flow = computeFlow(im)

Training

Training sequentially is faster than training end-to-end, since only a small number of parameters has to be learned at each level. To train level N, we need the trained models at levels 1 to N-1. The model is also initialized with the pretrained model at level N-1.

For example, to train level 3, we need the trained models at L1 and L2, and we initialize it with modelL2_3.t7.

th main.lua -fineWidth 128 -fineHeight 96 -level 3 -netType volcon \
-cache checkpoint -data FLYING_CHAIRS_DIR \
-L1 models/modelL1_3.t7 -L2 models/modelL2_3.t7 \
-retrain models/modelL2_3.t7
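
The rule above (models for levels 1 to N-1, initialization from level N-1) can be turned into a small command builder. This is only a sketch: buildTrainCommand is a hypothetical helper, it uses just the flags that appear in the example command, and the choice of fineWidth and fineHeight for other levels is left to the caller.

```lua
-- Hypothetical helper (not part of the repository): build the training
-- command for `level`, given the paths of the models already trained for
-- levels 1 .. level-1, in order.
local function buildTrainCommand(level, fineWidth, fineHeight, dataDir, trained)
  assert(level >= 2, 'this sketch only covers levels with lower-level models')
  assert(#trained == level - 1, 'need one trained model per lower level')
  local parts = {
    'th main.lua',
    '-fineWidth ' .. fineWidth,
    '-fineHeight ' .. fineHeight,
    '-level ' .. level,
    '-netType volcon',
    '-cache checkpoint',
    '-data ' .. dataDir,
  }
  for i, path in ipairs(trained) do
    parts[#parts + 1] = '-L' .. i .. ' ' .. path
  end
  -- Initialize from the model trained at the level just below.
  parts[#parts + 1] = '-retrain ' .. trained[level - 1]
  return table.concat(parts, ' ')
end

print(buildTrainCommand(3, 128, 96, 'FLYING_CHAIRS_DIR',
  { 'models/modelL1_3.t7', 'models/modelL2_3.t7' }))
```

With these arguments the printed command matches the level 3 example above, on a single line.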

End2End SPyNet

The end-to-end version of SPyNet is easily trainable and is available at anuragranj/end2end-spynet.

Optical Flow Utilities

We provide flowExtensions.lua, which contains various functions to make your life easier with optical flow while using Torch/Lua. You can just copy this file into your project directory and use it off the shelf.

flowX = require 'flowExtensions'

[flow_magnitude] flowX.computeNorm(flow_x, flow_y)

Given flow_x and flow_y of size MxN each, evaluate flow_magnitude of size MxN.

[flow_angle] flowX.computeAngle(flow_x, flow_y)

Given flow_x and flow_y of size MxN each, evaluate flow_angle of size MxN in degrees.
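
For a single pixel, magnitude and angle can be sketched in plain Lua as follows. The function names are hypothetical, and the angle convention (measured from the positive x axis and wrapped to [0, 360)) is an assumption; the convention used by flowExtensions is not stated here.

```lua
-- math.atan2 exists in Lua 5.1/LuaJIT; newer Lua versions fold it into
-- the two-argument form of math.atan.
local atan2 = math.atan2 or math.atan

-- Magnitude of a single flow vector (u, v).
local function flowNorm(u, v)
  return math.sqrt(u * u + v * v)
end

-- Direction of (u, v) in degrees.
-- ASSUMPTION: measured from the positive x axis, wrapped to [0, 360).
local function flowAngleDeg(u, v)
  local deg = math.deg(atan2(v, u))
  if deg < 0 then deg = deg + 360 end
  return deg
end

print(flowNorm(3, 4))       -- magnitude of the vector (3, 4)
print(flowAngleDeg(0, 1))   -- about 90 under the assumed convention
print(flowAngleDeg(0, -1))  -- about 270 under the assumed convention
```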

[rgb] flowX.field2rgb(flow_magnitude, flow_angle, [max], [legend])

Given flow_magnitude and flow_angle of size MxN each, return an image of size 3xMxN for visualizing optical flow. max (optional) specifies the maximum flow magnitude and legend (optional) is a boolean that prints a legend on the image.

[rgb] flowX.xy2rgb(flow_x, flow_y, [max])

Given flow_x and flow_y of size MxN each, return an image of size 3xMxN for visualizing optical flow. max (optional) specifies the maximum flow magnitude.
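
One common way to color-code a flow vector is to map its angle to hue and its magnitude to saturation. The sketch below illustrates that idea for a single pixel in plain Lua. It is an assumed mapping for illustration only: hsv2rgb and flowToRgb are hypothetical names, and the exact color scheme used by flowX.field2rgb and flowX.xy2rgb is not described here.

```lua
-- Standard HSV -> RGB conversion; h in degrees, s and v in [0, 1].
local function hsv2rgb(h, s, v)
  local c = v * s
  local hp = (h % 360) / 60
  local x = c * (1 - math.abs(hp % 2 - 1))
  local r, g, b
  if     hp < 1 then r, g, b = c, x, 0
  elseif hp < 2 then r, g, b = x, c, 0
  elseif hp < 3 then r, g, b = 0, c, x
  elseif hp < 4 then r, g, b = 0, x, c
  elseif hp < 5 then r, g, b = x, 0, c
  else               r, g, b = c, 0, x
  end
  local m = v - c
  return r + m, g + m, b + m
end

-- ASSUMED convention: hue = angle in degrees, saturation = magnitude / max
-- (clamped to 1), value = 1. Zero flow therefore maps to white.
local function flowToRgb(magnitude, angleDeg, max)
  local s = math.min(magnitude / max, 1)
  return hsv2rgb(angleDeg, s, 1)
end

print(flowToRgb(10, 0, 10))   -- fully saturated color at hue 0 (red)
print(flowToRgb(0, 123, 10))  -- zero flow: white
```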

[flow] flowX.loadFLO(filename)

Reads a .flo file. Loads the x and y components of optical flow into a 2-channel 2xMxN optical flow field. The first channel stores the x component and the second channel stores the y component.
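
For orientation, a Middlebury .flo file starts with a 12-byte header: the 4-byte tag "PIEH", followed by the width and height as little-endian 32-bit integers. The flow values follow as 32-bit floats, with x and y interleaved per pixel in row-major order. The sketch below reads only the header in plain Lua. readFloHeader is a hypothetical helper, not the flowX.loadFLO implementation.

```lua
-- Decode a little-endian unsigned 32-bit integer from the 4 bytes of `s`
-- starting at position `i`.
local function readUInt32LE(s, i)
  local b1, b2, b3, b4 = s:byte(i, i + 3)
  return b1 + b2 * 256 + b3 * 65536 + b4 * 16777216
end

-- Hypothetical helper: read the 12-byte header of a .flo file and return
-- the width and height of the stored flow field.
local function readFloHeader(filename)
  local f = assert(io.open(filename, 'rb'))
  local header = f:read(12)
  f:close()
  assert(header and #header == 12, 'file too short to be a .flo file')
  assert(header:sub(1, 4) == 'PIEH', 'bad .flo magic tag')
  local width = readUInt32LE(header, 5)
  local height = readUInt32LE(header, 9)
  return width, height
end
```

Checking the header this way can be a quick sanity test before loading a large batch of files with flowX.loadFLO.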

flowX.writeFLO(filename,F)

Writes a 2xMxN flow field F, containing the x and y components of the flow in its first and second channels respectively, to filename, a .flo file.

[flow] flowX.loadPFM(filename)

Reads a .pfm file. Loads the x and y components of optical flow into a 2-channel 2xMxN optical flow field. The first channel stores the x component and the second channel stores the y component.

[flow_rotated] flowX.rotate(flow, angle)

Rotates flow of size 2xMxN by angle in radians. Uses nearest-neighbor interpolation to avoid blurring at boundaries.
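
Rotating a flow field generally involves two steps: resampling the pixel grid, and rotating each flow vector so that it still points the right way in the rotated image. The sketch below shows only the second step for a single vector, in plain Lua. rotateFlowVector is a hypothetical helper, and the sign convention is an assumption that may differ from the one flowX.rotate uses.

```lua
-- Rotate a single flow vector (u, v) by `angle` radians.
-- ASSUMPTION: standard 2-D rotation matrix, so that in a frame with x to
-- the right and y upward, a positive angle turns the vector
-- counter-clockwise.
local function rotateFlowVector(u, v, angle)
  local c, s = math.cos(angle), math.sin(angle)
  return c * u - s * v, s * u + c * v
end

-- A purely horizontal flow vector rotated by a quarter turn.
print(rotateFlowVector(1, 0, math.pi / 2))
```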

[flow_scaled] flowX.scale(flow, sc, [opt])

Scales flow of size 2xMxN by a factor of sc. opt (optional) specifies the interpolation method: simple (default), bilinear, or bicubic.
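
When a flow field is resized, the displacement values are usually multiplied by the same factor, so that they stay correct in the new pixel grid. The plain-Lua sketch below shows this for one channel stored as a table of rows, using nearest-neighbor sampling and an integer factor. scaleChannel is a hypothetical helper, and whether flowX.scale rescales the values in exactly this way is an assumption.

```lua
-- Hypothetical helper: nearest-neighbor upscaling of one flow channel
-- (a table of rows) by an integer factor `sc`.
-- ASSUMPTION: the flow values are multiplied by `sc` as well, because a
-- displacement of 1 pixel becomes `sc` pixels in the enlarged grid.
local function scaleChannel(channel, sc)
  local out = {}
  local rows, cols = #channel, #channel[1]
  for i = 1, rows * sc do
    local srcRow = channel[math.ceil(i / sc)]
    local row = {}
    for j = 1, cols * sc do
      row[j] = srcRow[math.ceil(j / sc)] * sc
    end
    out[i] = row
  end
  return out
end

-- A 1x2 channel becomes 2x4, and every value is doubled.
local scaled = scaleChannel({ { 1, 2 } }, 2)
print(#scaled, #scaled[1])
```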

[flowBatch_scaled] flowX.scaleBatch(flowBatch, sc)

Scales flowBatch of size Bx2xMxN, a batch of B flow fields, by a factor of sc. Uses nearest-neighbor interpolation.

Timing Benchmarks

Our timing benchmark is set up on the Flying Chairs dataset. To test it, you need to download the dataset:

wget http://lmb.informatik.uni-freiburg.de/resources/datasets/FlyingChairs/FlyingChairs.zip

Run the timing benchmark

th timing_benchmark.lua -data YOUR_FLYING_CHAIRS_DATA_DIRECTORY

References

  1. Our warping code is based on qassemoquab/stnbhwd.
  2. The images in samples are from the Flying Chairs dataset: Dosovitskiy, Alexey, et al. "FlowNet: Learning optical flow with convolutional networks." 2015 IEEE International Conference on Computer Vision (ICCV). IEEE, 2015.
  3. Some parts of flowExtensions.lua are adapted from marcoscoffier/optical-flow with help from fguney.
  4. The unofficial PyTorch implementation is from sniklaus.

License

Free for non-commercial and scientific research purposes. For commercial use, please contact [email protected]. Check LICENSE file for details.

When using this code, please cite

Ranjan, Anurag, and Michael J. Black. "Optical Flow Estimation using a Spatial Pyramid Network." arXiv preprint arXiv:1611.00850 (2016).