What is SPyNet?

Set up SPyNet according to the image size and model. For optimal performance, resize your images so that the width and height are multiples of 32. You can also choose which model to use. The currently supported models are three fine-tuned models, of which sintelFinal is the default, and two base models.

    spynet = require('spynet')
    computeFlow = spynet.setup(512, 384, 'sintelFinal')    -- for 384x512 images

Now you can call computeFlow anytime to estimate optical flow between image pairs.
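As a minimal sketch of the sizing rule above, the hypothetical helper roundUpTo32 (not part of spynet) rounds each dimension up to the next multiple of 32. The result is the width and height to resize to and to pass to spynet.setup. It is plain Lua and makes no assumption about how the resize itself is done.

```lua
-- Hypothetical helper, not part of spynet: round a dimension up to the
-- nearest multiple of 32.
local function roundUpTo32(n)
  return math.ceil(n / 32) * 32
end

-- Width and height to resize to before calling spynet.setup(width, height, ...).
local function setupSize(width, height)
  return roundUpTo32(width), roundUpTo32(height)
end

local w, h = setupSize(1024, 436)
print(w, h)
```

Flow estimated on a resized image is expressed in pixels of the resized image, so rescale the flow if you need it at the original resolution.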

Computing flow

Load an image pair, stack the two images along the channel dimension, and normalize the result.

    im1 = image.load('samples/00001_img1.ppm')
    im2 = image.load('samples/00001_img2.ppm')
    im = torch.cat(im1, im2, 1)
    im = spynet.normalize(im)

SPyNet works on batches of data on CUDA, so add a batch dimension, move the tensor to the GPU, and compute the flow:

    im = im:resize(1, im:size(1), im:size(2), im:size(3)):cuda()
    flow = computeFlow(im)

You can also use batch mode. If your images im are a tensor of size Bx6xHxW, that is, a batch of B image pairs with 6 channels each (two stacked RGB images), you can call computeFlow on it directly:

flow = computeFlow(im)

Training

Training sequentially is faster than training end-to-end, since only a small number of parameters has to be learned at each level. To train level N, we need the trained models at levels 1 to N-1. The model for level N is also initialized with the pretrained model from level N-1.

For example, to train level 3 we need the trained models at L1 and L2, and we initialize it with modelL2_3.t7.

    th main.lua -fineWidth 128 -fineHeight 96 -level 3 -netType volcon \
    -cache checkpoint -data FLYING_CHAIRS_DIR \
    -L1 models/modelL1_3.t7 -L2 models/modelL2_3.t7 \
    -retrain models/modelL2_3.t7
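The sequential rule can be sketched as a small command builder. The helper trainCommand below is hypothetical and only assembles a string. The flags are the ones shown in the level 3 example; extending the -L1 and -L2 pattern to further levels as -L<k> is an assumption, so check the options accepted by main.lua before relying on it.

```lua
-- Hypothetical helper: assemble the main.lua command line for training `level`.
-- models[k] is the path of the trained model for level k (k = 1 .. level - 1).
local function trainCommand(level, fineWidth, fineHeight, dataDir, models)
  assert(level >= 2, 'this sketch covers levels that build on earlier ones')
  local parts = {
    'th main.lua',
    '-fineWidth ' .. fineWidth,
    '-fineHeight ' .. fineHeight,
    '-level ' .. level,
    '-netType volcon',
    '-cache checkpoint',
    '-data ' .. dataDir,
  }
  -- One flag per previously trained level (assumed flag pattern -L<k>).
  for k = 1, level - 1 do
    assert(models[k], 'missing trained model for level ' .. k)
    parts[#parts + 1] = '-L' .. k .. ' ' .. models[k]
  end
  -- Initialize from the model one level below.
  parts[#parts + 1] = '-retrain ' .. models[level - 1]
  return table.concat(parts, ' ')
end

print(trainCommand(3, 128, 96, 'FLYING_CHAIRS_DIR',
  { 'models/modelL1_3.t7', 'models/modelL2_3.t7' }))
```

For level 3 this reproduces the command above on a single line.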

End2End SPyNet

The end-to-end version of SPyNet is easily trainable and is available at anuragranj/end2end-spynet.

Optical Flow Utilities

We provide flowExtensions.lua, which contains various functions to make working with optical flow in Torch/Lua easier. You can copy this file into your project directory and use it off the shelf.

flowX = require 'flowExtensions'

[flow_magnitude] flowX.computeNorm(flow_x, flow_y)

Given flow_x and flow_y of size HxW each, evaluate flow_magnitude of size HxW.

[flow_angle] flowX.computeAngle(flow_x, flow_y)

Given flow_x and flow_y of size HxW each, evaluate flow_angle of size HxW in degrees.
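For a single flow vector, the two quantities above can be sketched in plain Lua as follows. The helper names flowNorm and flowAngle are hypothetical. Measuring the angle as atan2(flow_y, flow_x) and wrapping it to [0, 360) are assumptions, not necessarily the convention used by flowX.computeAngle.

```lua
-- math.atan2 was removed in Lua 5.4, where math.atan accepts two arguments.
local atan2 = math.atan2 or math.atan

-- Euclidean length of the flow vector (u, v).
local function flowNorm(u, v)
  return math.sqrt(u * u + v * v)
end

-- Direction of (u, v) in degrees, wrapped to [0, 360) (assumed convention).
local function flowAngle(u, v)
  local deg = math.deg(atan2(v, u))
  if deg < 0 then deg = deg + 360 end
  return deg
end

print(flowNorm(3, 4), flowAngle(1, 1))
```

The flowX functions apply the same kind of computation to whole HxW tensors at once.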

[rgb] flowX.field2rgb(flow_magnitude, flow_angle, [max], [legend])

Given flow_magnitude and flow_angle of size HxW each, return an image of size 3xHxW for visualizing optical flow. max (optional) specifies the maximum flow magnitude, and legend (optional) is a boolean that prints a legend on the image.

[rgb] flowX.xy2rgb(flow_x, flow_y, [max])

Given flow_x and flow_y of size HxW each, return an image of size 3xHxW for visualizing optical flow. max (optional) specifies the maximum flow magnitude.
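A common way to color a flow field is to map direction to hue and magnitude to saturation. The per-pixel sketch below illustrates that idea under the assumption of a plain HSV mapping with full brightness; it is not taken from flowExtensions.lua, and the colors produced by field2rgb and xy2rgb may differ. The names hsv2rgb and flowColor are hypothetical.

```lua
-- Convert HSV (h in degrees, s and v in [0, 1]) to RGB in [0, 1].
local function hsv2rgb(h, s, v)
  local c = v * s
  local hp = (h % 360) / 60
  local x = c * (1 - math.abs(hp % 2 - 1))
  local m = v - c
  local r, g, b
  if hp < 1 then r, g, b = c, x, 0
  elseif hp < 2 then r, g, b = x, c, 0
  elseif hp < 3 then r, g, b = 0, c, x
  elseif hp < 4 then r, g, b = 0, x, c
  elseif hp < 5 then r, g, b = x, 0, c
  else r, g, b = c, 0, x end
  return r + m, g + m, b + m
end

-- Hue follows the flow direction; saturation grows with the magnitude and
-- is clipped once the magnitude reaches maxMagnitude.
local function flowColor(magnitude, angleDeg, maxMagnitude)
  assert(maxMagnitude > 0, 'maxMagnitude must be positive')
  local s = math.min(magnitude / maxMagnitude, 1)
  return hsv2rgb(angleDeg, s, 1)
end

print(flowColor(5, 240, 10))
```

With this mapping, zero motion is white and larger motion becomes more saturated, which is why a maximum magnitude is needed to fix the color scale.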

[flow] flowX.loadFLO(filename)

Reads a .flo file. Loads the x and y components of optical flow into a 2-channel 2xHxW optical flow field. The first channel stores the x component and the second channel stores the y component.

flowX.writeFLO(filename,F)

Writes a 2xHxW flow field F, containing the x and y components of the flow in its first and second channels respectively, to filename, a .flo file.
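For orientation, a .flo file in the Middlebury format starts with the 4-byte tag PIEH, followed by the width and height as little-endian 32-bit integers, and then two 32-bit floats (x and y) per pixel in row order. The plain Lua sketch below only parses that 12-byte header; parseFloHeader and readUInt32LE are hypothetical helpers, not functions from flowExtensions.lua.

```lua
-- Decode a little-endian unsigned 32-bit integer starting at byte `pos`.
local function readUInt32LE(s, pos)
  local b1, b2, b3, b4 = s:byte(pos, pos + 3)
  return b1 + b2 * 256 + b3 * 65536 + b4 * 16777216
end

-- Parse the 12-byte header of a .flo file held in the string `data`.
-- Returns width, height, and the total file size implied by the header.
local function parseFloHeader(data)
  assert(#data >= 12, 'file too short for a .flo header')
  assert(data:sub(1, 4) == 'PIEH', 'missing .flo magic tag')
  local width = readUInt32LE(data, 5)
  local height = readUInt32LE(data, 9)
  -- Two float32 values (x and y) per pixel follow the header.
  local expectedBytes = 12 + width * height * 2 * 4
  return width, height, expectedBytes
end

-- Usage with a hand-built header for a 512x384 flow field.
local header = 'PIEH' .. string.char(0, 2, 0, 0) .. string.char(128, 1, 0, 0)
print(parseFloHeader(header))
```

For a real file, read the first 12 bytes with io.open(path, 'rb') and compare the returned size with the size on disk as a quick sanity check.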

[flow] flowX.loadPFM(filename)

Reads a .pfm file. Loads the x and y components of optical flow into a 2-channel 2xHxW optical flow field. The first channel stores the x component and the second channel stores the y component.

[flow_rotated] flowX.rotate(flow, angle)

Rotates flow of size 2xHxW by angle, given in radians. Uses nearest-neighbor interpolation to avoid blurring at boundaries.

[flow_scaled] flowX.scale(flow, sc, [opt])

Scales flow of size 2xHxW by sc times. opt (optional) selects the interpolation method from the three supported options.

[flowBatch_scaled] flowX.scaleBatch(flowBatch, sc)

Scales flowBatch of size Bx2xHxW, a batch of B flow fields, by sc times. Uses nearest-neighbor interpolation.
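Scaling a flow field involves two things: resampling the grid and, by the usual convention, multiplying the displacement values by the same factor so they stay in pixels of the new grid. The sketch below shows this for one channel stored as a plain table of rows, using nearest-neighbor sampling. scaleChannel is a hypothetical helper, and whether flowX.scale and flowX.scaleBatch multiply the values in the same way should be checked against flowExtensions.lua.

```lua
-- Nearest-neighbor scaling of one flow channel stored as a table of rows.
-- Values are multiplied by `sc` so displacements match the new grid.
local function scaleChannel(channel, sc)
  local h, w = #channel, #channel[1]
  local newH = math.floor(h * sc + 0.5)
  local newW = math.floor(w * sc + 0.5)
  local out = {}
  for i = 1, newH do
    -- Source row for output row i, clamped to the last valid row.
    local srcI = math.min(h, math.floor((i - 1) / sc) + 1)
    out[i] = {}
    for j = 1, newW do
      local srcJ = math.min(w, math.floor((j - 1) / sc) + 1)
      out[i][j] = channel[srcI][srcJ] * sc
    end
  end
  return out
end

local scaled = scaleChannel({ { 1, 2 }, { 3, 4 } }, 2)
for _, row in ipairs(scaled) do
  print(table.concat(row, ' '))
end
```

Nearest-neighbor sampling never averages across motion boundaries, which keeps flow discontinuities sharp.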

Timing Benchmarks

Our timing benchmark is set up on the Flying Chairs dataset. To run it, first download the dataset:

    wget https://lmb.informatik.uni-freiburg.de/resources/datasets/FlyingChairs/FlyingChairs.zip

Then run the timing benchmark script from the repository on the extracted dataset.

References

  1. Our warping code is based on qassemoquab/stnbhwd.
  2. The images in samples are from the Flying Chairs dataset: Dosovitskiy, Alexey, et al. "FlowNet: Learning optical flow with convolutional networks." 2015 IEEE International Conference on Computer Vision (ICCV). IEEE, 2015.
  3. Some parts of flowExtensions.lua are adapted from marcoscoffier/optical-flow with help from fguney.
  4. The unofficial PyTorch implementation is from sniklaus.

License

Free for non-commercial and scientific research purposes. For commercial use, please contact ps-license@tue.mpg.de. Check LICENSE file for details.

When using this code, please cite

Ranjan, Anurag, and Michael J. Black. "Optical Flow Estimation using a Spatial Pyramid Network." arXiv preprint arXiv:1611.00850 (2016).
