# Kissing Detector Detect kissing scenes in a movie using both audio and video features. Project for [Stanford CS231N](http://cs231n.stanford.edu) ## Build dataset ```python from pipeline import BuildDataset videos_and_labels = [ # (file name in base_path, label) where label is 1 for kissing and 0 for not kissing ('movies_casino_royale_2006_kissing_1.mp4', 1), ('movies_casino_royale_2006_kissing_2.mp4', 1), ('movies_casino_royale_2006_kissing_3.mp4', 1), ('movies_casino_royale_2006_not_1.mp4', 0), ('movies_casino_royale_2006_not_2.mp4', 0), ('movies_casino_royale_2006_not_3.mp4', 0), ('movies_goldeneye_1995_kissing_1.mp4', 1), ('movies_goldeneye_1995_kissing_2.mp4', 1), ('movies_goldeneye_1995_kissing_3.mp4', 1), ('movies_goldeneye_1995_not_1.mp4', 0), ('movies_goldeneye_1995_not_2.mp4', 0), ('movies_goldeneye_1995_not_3.mp4', 0), ] builder = BuildDataset(base_path='path/to/movies', videos_and_labels=videos_and_labels, output_path='/path/to/output', test_size=1 / 3) # set aside 1 / 3 of data for validation builder.build_dataset() ``` ### Data loader ## Explorations: - ConvNet, VGGish, or both - ConvNet architectures: ResNet, VGG, AlexNet, SqueezeNet, DenseNet - With and without pre-training - - (3DC) ## Diagnostics - Saliency maps - Class viz - Confusion matrices - Detected segments - Failure examples ## TODO - Define experiments - Saliency map, class viz - - ...