# Kissing Detector

Detect kissing scenes in a movie using both audio and video features. Project for [Stanford CS231N](http://cs231n.stanford.edu).

## Resources

- [Paper](https://arxiv.org/abs/1906.01843)
- [Poster](poster.pdf)
- [Video](https://www.youtube.com/watch?v=3IIGLupGAkI)

## Running the code

Use Python 3.6+.

```bash
python3 experiments.py
```

This runs the experiments in `params.py` specified by the `experiments` dictionary. (A hypothetical setup-and-run sketch is included at the end of this README.)

## Requirements

This is a PyTorch project. See `requirements.txt` for details.

## Build dataset

The following builds the dataset for training. You need to provide the path to the video segments.

```python
from pipeline import BuildDataset

# (file name in base_path, label) where label is 1 for kissing and 0 for not kissing
videos_and_labels = [
    ('movies_casino_royale_2006_kissing_1.mp4', 1),
    ('movies_casino_royale_2006_kissing_2.mp4', 1),
    ('movies_casino_royale_2006_kissing_3.mp4', 1),
    ('movies_casino_royale_2006_not_1.mp4', 0),
    ('movies_casino_royale_2006_not_2.mp4', 0),
    ('movies_casino_royale_2006_not_3.mp4', 0),

    ('movies_goldeneye_1995_kissing_1.mp4', 1),
    ('movies_goldeneye_1995_kissing_2.mp4', 1),
    ('movies_goldeneye_1995_kissing_3.mp4', 1),
    ('movies_goldeneye_1995_not_1.mp4', 0),
    ('movies_goldeneye_1995_not_2.mp4', 0),
    ('movies_goldeneye_1995_not_3.mp4', 0),
]

builder = BuildDataset(
    base_path='path/to/movies',
    videos_and_labels=videos_and_labels,
    output_path='/path/to/output',
    test_size=1 / 3,  # set aside 1/3 of the data for validation
)
builder.build_dataset()
```

## Detect kissing segments in a given video

```python
from segmentor import Segmentor
import utils

# Download model.pkl from
# https://drive.google.com/file/d/1RlvvdInTXtJikGv_ZbHcKoblCypN1Z0A/view?usp=sharing
# or train your own.
model = utils.unpickle('model.pkl')  # pickled PyTorch model
s = Segmentor(model, min_frames=10, threshold=0.7)

# For the YouTube clip "Hot Summer Nights - Kiss Scene (Maika Monroe and Timothee Chalamet)"
# at https://www.youtube.com/watch?v=GG5HmLQ_Fx0
# v=XXX is the YouTube ID; pass that here.
s.visualize_segments_youtube('GG5HmLQ_Fx0')

# Alternatively, you can provide a path to a local MP4 file.
s.visualize_segments('path/to/file.mp4')
```

See examples in [examples/detector.ipynb](examples/detector.ipynb).

## References

- [Video Classification Using 3D ResNet](https://github.com/kenshohara/video-classification-3d-cnn-pytorch)
- [3D ResNets for Action Recognition (CVPR 2018)](https://github.com/kenshohara/3D-ResNets-PyTorch/)
- [AudioSet](https://research.google.com/audioset/download.html)
- [TensorFlow AudioSet](https://github.com/tensorflow/models/tree/master/research/audioset)
- [CS231N Saliency maps and class viz PyTorch code](http://cs231n.github.io/assignments2019/assignment3/)
- [Torch VGGish](https://github.com/harritaylor/torchvggish)
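
## Appendix: setup and run (hypothetical sketch)

The sketch below shows one way to set up an environment and launch training, referenced from "Running the code" above. It assumes a standard `venv`/`pip` workflow and that `requirements.txt` and `experiments.py` sit in the repository root; the contents of `params.py` are not reproduced here, so edit its `experiments` dictionary before running.

```bash
# Hypothetical setup sketch -- adjust paths and tools to your environment.
# Requires Python 3.6+.
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Run the experiments defined by the `experiments` dictionary in params.py.
python3 experiments.py
```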