We propose a hierarchical method for human detection and activity recognition in MPEG sequences. The algorithm consists of three stages that operate at different resolution levels. The first stage applies principal component analysis to the MPEG motion vectors of macroblocks, which are grouped according to velocity, distance, and human body proportions; this reduces both the computational complexity and the amount of data to be processed. The second stage takes the DC DCT components of luminance and chrominance as input and matches them against activity templates and a human skin template. The last stage performs a more detailed analysis of the regions extracted in the previous stages, now uncompressed, via model-based segmentation and graph matching. This hierarchical scheme allows operation at different levels, ranging from low complexity at the coarse levels to low false-detection rates at the fine levels. Notably, significant information can be obtained from the compressed domain and connected to high-level semantics.
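
As an illustration of the first stage only, the following minimal sketch computes the principal axis of a group of macroblock motion vectors. It is not the implementation described here: the function name `principal_motion_axis`, the representation of motion vectors as `(dx, dy)` pairs, and the sample values are all assumptions made for the example, and the grouping by velocity, distance, and body proportions is taken as already done.

```python
import math

def principal_motion_axis(vectors):
    """PCA of 2-D motion vectors via the closed form for a 2x2 covariance matrix.

    `vectors` is a non-empty list of (dx, dy) pairs (a hypothetical representation).
    Returns (mean, (lam1, lam2), axis) where lam1 >= lam2 are the eigenvalues
    and `axis` is the unit eigenvector belonging to lam1.
    """
    n = len(vectors)
    mx = sum(dx for dx, _ in vectors) / n
    my = sum(dy for _, dy in vectors) / n

    # Sample covariance entries (guard against a single-vector group).
    denom = max(n - 1, 1)
    sxx = sum((dx - mx) ** 2 for dx, _ in vectors) / denom
    syy = sum((dy - my) ** 2 for _, dy in vectors) / denom
    sxy = sum((dx - mx) * (dy - my) for dx, dy in vectors) / denom

    # Eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    lam1, lam2 = half_trace + radius, half_trace - radius

    # Orientation of the dominant eigenvector.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    axis = (math.cos(theta), math.sin(theta))
    return (mx, my), (lam1, lam2), axis

# Made-up motion vectors for one macroblock group moving mostly horizontally.
sample_vectors = [(2.0, 0.1), (4.0, -0.1), (6.0, 0.1), (8.0, -0.1)]
mean, eigenvalues, axis = principal_motion_axis(sample_vectors)
```

A large ratio between the two eigenvalues indicates that the group moves predominantly along one direction, which is the kind of compact descriptor that could be passed to the later stages in place of the raw vectors.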