We present a novel design for a large-scale multicast packet switch that features a modular switch architecture and a distributed resource allocation algorithm. The switch inputs and outputs are grouped into small modules called input shared blocks (ISBs) and output shared blocks (OSBs). Input link sharing and output link sharing are coordinated so that no speedup is necessary in the central switch fabric (ATMCSF). Cell delivery is based on link reservation in every ISB. We propose a dual round robin dynamic link reservation (DRRDLR) algorithm to achieve fast and fair allocation of link resources among ISBs. DRRDLR is distributed in the sense that each ISB can dynamically increase or decrease its link reservation for a specific OSB using only locally available information. The arbitration complexity is O(1). The switch performance is evaluated through simulations of a 256×256 switch. The results show that the proposed switch achieves performance comparable to that of an output-queued (OQ) switch under any traffic pattern. Moreover, our switch design eliminates the N-fold speedup required by the OQ switch.
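The idea of an ISB adjusting its reservation from local information, with constant work per arbitration step, can be illustrated with a toy sketch. This is not the DRRDLR algorithm itself, whose details are not given here. The class `ToyISB`, the function `step`, the backlog-versus-reservation rule, and the two round-robin pointers are all hypothetical names and assumptions chosen for illustration only.

```python
class ToyISB:
    """Hypothetical input shared block with per-OSB link reservations."""

    def __init__(self, num_osbs):
        self.reserved = [0] * num_osbs  # links currently reserved per OSB
        self.backlog = [0] * num_osbs   # cells queued locally per OSB

    def adjust(self, osb, free_links):
        """Constant-time local decision for one OSB (assumed rule).

        Returns +1 if a link was reserved, -1 if one was released, else 0.
        """
        if self.backlog[osb] > self.reserved[osb] and free_links > 0:
            self.reserved[osb] += 1
            return 1
        if self.backlog[osb] < self.reserved[osb]:
            self.reserved[osb] -= 1
            return -1
        return 0


def step(isbs, free_links, isb_ptr, osb_ptr):
    """One toy arbitration step: a single ISB adjusts a single OSB.

    Two round-robin pointers are advanced: the ISB pointer every step,
    the OSB pointer each time the ISB pointer wraps around.
    """
    delta = isbs[isb_ptr].adjust(osb_ptr, free_links[osb_ptr])
    free_links[osb_ptr] -= delta
    next_isb = (isb_ptr + 1) % len(isbs)
    next_osb = (osb_ptr + 1) % len(free_links) if next_isb == 0 else osb_ptr
    return next_isb, next_osb


if __name__ == "__main__":
    isbs = [ToyISB(num_osbs=1), ToyISB(num_osbs=1)]
    isbs[0].backlog[0] = 3
    isbs[1].backlog[0] = 1
    free_links = [2]  # assumed: the OSB offers two shareable links
    i, o = 0, 0
    for _ in range(4):
        i, o = step(isbs, free_links, i, o)
    print(isbs[0].reserved, isbs[1].reserved, free_links)
```

In this sketch each step touches one ISB and one OSB, so the work per step does not grow with the switch size. A link released by one ISB becomes available to the next ISB that the round-robin pointer visits.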