Berkan Solmaz Web

Research Projects

IARPA Aladdin Project (Classification of complex events in user-generated videos)

The goal of this program is to extract information from diverse video clips and to understand their content in order to classify them. For this project, we presented a new global motion and scene descriptor for classifying realistic videos of different actions. Our method bypasses background subtraction and tracking, the detection of interest points, the extraction of local video descriptors, and the quantization of descriptors into a codebook; it represents each video sequence as a single feature vector, which is computed by applying a bank of 3-D spatiotemporal filters to the frequency spectrum of the video sequence. (DETAILS)
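The idea of filtering a video's frequency spectrum can be sketched roughly as follows. This is a simplified illustration, not the project's actual descriptor: it assumes NumPy is available, uses plain Gaussian band-pass filters, and pools each filter response into one number. The function names, the filter centers, and `sigma` are all hypothetical choices made for the example.

```python
import numpy as np

def gaussian_filter_bank(shape, centers, sigma):
    """Build 3-D Gaussian band-pass filters over normalized (t, y, x) frequencies.

    The Gaussian shape is an assumption made for illustration only.
    """
    axes = [np.fft.fftshift(np.fft.fftfreq(n)) for n in shape]
    ft, fy, fx = np.meshgrid(*axes, indexing="ij")
    bank = []
    for ct, cy, cx in centers:
        dist2 = (ft - ct) ** 2 + (fy - cy) ** 2 + (fx - cx) ** 2
        bank.append(np.exp(-dist2 / (2.0 * sigma ** 2)))
    return bank

def global_descriptor(video, centers, sigma=0.1):
    """Return one feature per filter: the spectral energy that filter passes."""
    # Magnitude of the centered 3-D frequency spectrum of the whole clip.
    spectrum = np.abs(np.fft.fftshift(np.fft.fftn(video)))
    bank = gaussian_filter_bank(video.shape, centers, sigma)
    feats = np.array([np.sum(spectrum * f) for f in bank])
    norm = np.linalg.norm(feats)
    return feats / norm if norm > 0 else feats

# Hypothetical usage: a random 8-frame, 16x16 grayscale clip and three filters.
rng = np.random.default_rng(0)
clip = rng.random((8, 16, 16))
centers = [(0.0, 0.0, 0.0), (0.25, 0.0, 0.0), (0.0, 0.25, 0.25)]
descriptor = global_descriptor(clip, centers)
```

The whole clip maps to a single fixed-length vector without any tracking, interest point detection, or codebook, which is the property the description above emphasizes.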

Army Research Office Crowds Project (Identifying Behaviors in Crowds)

In this project, a novel approach, based on the optical flow in a video sequence, is proposed for identifying five specific crowd behaviors in visual scenes. The scene is overlaid with a grid of particles, initializing a dynamical system that is derived from the optical flow. Numerical integration of the system provides particle trajectories that represent the motion in the scene. Linearization of the dynamical system allows a simple and practical analysis of the behavior through the Jacobian matrix. Essentially, the eigenvalues of this matrix are used to determine the dynamic stability of points in the flow, and each type of stability corresponds to a crowd behavior. (DETAILS)
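The eigenvalue-based stability analysis can be illustrated with the standard classification of a linearized 2-D flow. This is a minimal sketch under stated assumptions: it uses the textbook trace/determinant test for a 2x2 Jacobian, and the function name, tolerance, and label strings are hypothetical. The mapping from stability types to the five crowd behaviors is not given here, so the sketch stops at the stability label.

```python
def classify_stability(jacobian, tol=1e-9):
    """Classify a critical point of a linearized 2-D flow from its 2x2 Jacobian.

    Uses the trace and determinant, which determine the eigenvalues of a
    2x2 matrix up to ordering.
    """
    (a, b), (c, d) = jacobian
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det  # discriminant of the characteristic polynomial

    if abs(det) <= tol:
        return "degenerate"  # at least one eigenvalue is (numerically) zero
    if det < 0:
        return "saddle"  # real eigenvalues of opposite sign
    if disc >= 0:
        # Real eigenvalues sharing the sign of the trace.
        return "stable node" if trace < 0 else "unstable node"
    # Complex conjugate eigenvalues; the real part is trace / 2.
    if abs(trace) <= tol:
        return "center"
    return "stable focus" if trace < 0 else "unstable focus"

# Hypothetical Jacobians, e.g. estimated by finite differences of the flow.
labels = [
    classify_stability([[-1.0, 0.0], [0.0, -2.0]]),
    classify_stability([[1.0, 0.0], [0.0, -1.0]]),
    classify_stability([[0.0, -1.0], [1.0, 0.0]]),
    classify_stability([[-1.0, -2.0], [2.0, -1.0]]),
]
```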

Harris Project (Geometry Based Human Detection in UAV Imagery)

Detecting humans in imagery taken from a UAV is a challenging problem due to the small number of pixels on target, which makes it more difficult to distinguish people from background clutter and results in a much larger search space. In this project, we proposed a method for human detection based on a number of geometric constraints obtained from the metadata. Specifically, we obtain the orientation of the ground plane normal, the orientation of shadows cast by humans in the scene, and the relationship between human heights and the sizes of their corresponding shadows. We use this information in a geometry-based shadow and human blob detector, which provides an initial estimate of the locations of humans in the scene. These candidate locations are then classified as either human or clutter using a combination of wavelet features and an SVM. Our method works on a single frame and, unlike methods based on motion detection, bypasses global motion compensation and allows detection of stationary and slow-moving humans. It also avoids searching across the entire image, which makes it more accurate and very fast. We show results on sequences from the VIVID dataset and our own data, and provide a comparative analysis. (DETAILS)

NIH Project (Brain Tumor Segmentation in Multi-parametric MRI)

Segmenting enhancing brain tumors for accurate tumor volume measurement is a challenging task due to the large variation in tumor appearance and shape, which makes it difficult to incorporate the prior knowledge commonly used in other medical image segmentation tasks. In this project, a novel idea, the confidence surface, is proposed to guide the segmentation of enhancing brain tumors using information across multi-parametric magnetic resonance imaging (MRI). Texture information, along with the typical intensity information from MRI images, is used to train a discriminative classifier at the pixel level. The classifier is used to generate a confidence surface, which gives the likelihood of each pixel being tumor or non-tumor. The resulting confidence surface is then incorporated into two classical methods for segmentation guidance. (PDF)

ADHD Project (Classification of ADHD using network features on f-MRI)

Attention Deficit Hyperactivity Disorder (ADHD) is receiving a lot of attention nowadays, mainly because it is one of the most common brain disorders among children. The main goal of this project was the automatic classification of ADHD-diagnosed and control brains using functional Magnetic Resonance Images. For this purpose, we used a Bag-of-Words approach to represent each subject as a histogram of network features, namely the degree of each voxel. We also investigated the use of raw intensity values in the time series of each voxel. Here, every subject is represented as a combined histogram of network and raw intensity features. We performed classification using an SVM on a highly challenging dataset released by NITRC for the ADHD-200 competition. (PDF)
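The degree-histogram representation can be sketched as follows. This is only an illustration under assumptions not stated above: the network is built by linking voxels whose time series have a Pearson correlation above a threshold, and the function names, threshold, and bin count are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length time series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def voxel_degrees(series, threshold):
    """Degree of each voxel in a graph linking voxels correlated above `threshold`."""
    n = len(series)
    degrees = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if pearson(series[i], series[j]) > threshold:
                degrees[i] += 1
                degrees[j] += 1
    return degrees

def degree_histogram(degrees, n_bins, max_degree):
    """Normalized histogram of voxel degrees: one fixed-length vector per subject."""
    hist = [0] * n_bins
    for d in degrees:
        idx = min(d * n_bins // (max_degree + 1), n_bins - 1)
        hist[idx] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Hypothetical subject with four voxels and four time points each.
subject = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, 3, 2, 4]]
degrees = voxel_degrees(subject, threshold=0.9)
features = degree_histogram(degrees, n_bins=4, max_degree=3)
```

Each subject then becomes a vector of the same length regardless of how many voxels it has, which is what allows a standard classifier such as an SVM to be trained on the histograms.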

BERKAN SOLMAZ PhD, Electrical Engineering - Computer Vision SBIA, University of Pennsylvania