    MODULAR DESIGN OF HIGH-THROUGHPUT, LOW-LATENCY SORTING UNITS

    File(s)
    MS Thesis (919.2Kb)
    Date
    2012-05-20
    Author
    Farmahini-Farahani, Amin
    Department
    Electrical Engineering
    Advisor(s)
    Schulte, Michael
    Abstract
    High-throughput and low-latency sorting is a key requirement in many applications that deal with large amounts of data. Searching and high-energy physics systems require a considerable number of sorting units. The particle detectors in CERN's Large Hadron Collider require hundreds of fast sorting units. To provide the performance and flexibility needed in high-energy physics experiments, these sorting units are often implemented using high-end FPGA devices. This thesis presents efficient techniques for designing high-throughput, low-latency sorting units. Our sorting architectures utilize modular design techniques that hierarchically construct large sorting units from smaller building blocks. The sorting units are optimized for situations in which only the M largest numbers from N inputs are needed, since this situation commonly occurs in many applications for scientific computing, data mining, network processing, digital signal processing, and high-energy physics. We utilize our proposed techniques to design parameterized, pipelined, and modular sorting units. A detailed analysis of these sorting units indicates that as the number of inputs increases their resource requirements scale linearly, their latencies scale logarithmically, and their frequencies remain almost constant. When synthesized to a 65-nm TSMC technology, a single pipelined 256-to-4 sorting unit with 19 stages can perform more than 2.7 billion sorts per second with a latency of about 7 ns per sort. When implemented on a Virtex-5 FPGA, the same sorting unit can perform roughly 200 million sorts per second with a latency of about 95 ns per sort. We also propose iterative sorting techniques, in which a small sorting unit is used several times to find the largest values.
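    The modular idea in the abstract (building a large N-to-M unit from smaller blocks that each keep only their M largest values) can be illustrated in software. The following is a minimal sketch under stated assumptions, not the thesis's hardware architecture: the names `top_m_block`, `merge_top_m`, `modular_top_m`, and `pipeline_latency_ns`, and the default block size of 4, are hypothetical choices made for illustration. The tree depth it reports is not the thesis's pipeline stage count. The last helper assumes a fully pipelined unit that accepts one sort per clock cycle, which is an assumption used only to check that the quoted throughput and latency figures are mutually consistent.

```python
from typing import List, Sequence, Tuple


def top_m_block(values: Sequence[int], m: int) -> List[int]:
    """Software stand-in for a small building block: the m largest values, descending."""
    return sorted(values, reverse=True)[:m]


def merge_top_m(a: Sequence[int], b: Sequence[int], m: int) -> List[int]:
    """Combine two partial results and keep only the m largest values."""
    return top_m_block(list(a) + list(b), m)


def modular_top_m(values: Sequence[int], m: int, block_size: int = 4) -> Tuple[List[int], int]:
    """Hierarchically select the m largest of the inputs.

    Returns (result, levels), where levels counts the first layer of small
    blocks plus each pairwise merge layer (an illustrative tree depth only).
    """
    if m < 1 or block_size < 1:
        raise ValueError("m and block_size must be positive")
    # First layer: independent small blocks, each discarding all but m values.
    groups = [top_m_block(values[i:i + block_size], m)
              for i in range(0, len(values), block_size)]
    levels = 1
    # Merge layers: pair up partial results until a single one remains.
    while len(groups) > 1:
        merged = []
        for i in range(0, len(groups), 2):
            if i + 1 < len(groups):
                merged.append(merge_top_m(groups[i], groups[i + 1], m))
            else:
                merged.append(groups[i])  # odd group passes through unchanged
        groups = merged
        levels += 1
    return (groups[0] if groups else []), levels


def pipeline_latency_ns(stages: int, sorts_per_second: float) -> float:
    """Latency in ns, assuming one sort enters per clock cycle (an assumption)."""
    return stages / sorts_per_second * 1e9


if __name__ == "__main__":
    result, levels = modular_top_m(list(range(256)), m=4)
    print(result, levels)
    print(pipeline_latency_ns(19, 2.7e9))   # roughly 7 ns
    print(pipeline_latency_ns(19, 200e6))   # roughly 95 ns
```

    Because each layer discards everything except the M largest candidates, the number of merge layers grows with the logarithm of the input count, which is in line with the logarithmic latency scaling the abstract reports. Under the one-sort-per-cycle assumption, 19 stages at 2.7 billion sorts per second gives about 7 ns, and 19 stages at 200 million sorts per second gives 95 ns, matching the figures quoted above.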
    Permanent Link
    http://digital.library.wisc.edu/1793/62359
    Type
    Thesis
    Part of
    • Theses--Electrical Engineering
