IRAM A New Pattern in Memory

Submitted Saturday, June 18, 2005
Daniel Boorn


Entering a new era of computing, consumers are overwhelmed with choices of computer architectures. The underlying idea of performance is changing rapidly, creating gaps in efficient design. Personal connectivity has become a constant factor of daily routines, and commerce has split into multiple playing fields, forming a wireless race to an undefined venue. Multi-tasking is no longer tied to our personal or office PCs but extends to pocket-size mobile devices that stream audio and video to our delight. Faced with so many choices, we typically do not know the reality behind the constraints of developing such computing devices and architectures. The goal of this paper is not only to open the eyes of the typical consumer, but to inform them of the complex constraints and issues raised by current computing demands. Concerned mostly with mobile computing architecture, we will examine the concept of Intelligent RAM, or IRAM for short, which offers a new approach to chip design for energy efficiency and performance.

 

To fully understand the need for new microprocessor architectures, we must first understand the different kinds of computers and their everyday usage. For example, one would not own a bus pass and also drive a car to work every day; similarly, one would not buy a server to use as a personal computer. By understanding the different needs of the consumer and looking at the current types of computing, we can predict the need for future computer designs. Observing the devices on the market today, we can see that there are many types of computing needs.

 

Desktop

 

From the original personal computer to the high-performance machines found in most homes today, personal computers are used for everyday routines: checking email, online commerce, paying bills, and writing research papers like this one. It's clear that we as customers expect our home machine to handle a number of different tasks. But do we need to have the fastest machine, and what makes a machine fast? For desktop applications, a billion-transistor chip is not always necessary. Multi-tasking and graphics performance are more of a concern than floating-point performance. The average desktop user doesn't need to perform an extreme number of floating-point operations at one time, which leaves multi-threaded architectures in the lead for this type of computer. These chips deliver the best performance due to their advanced out-of-order execution and prediction techniques.

 

Server

 

Online commerce has become a matter of complex multi-user, or parallel, processing. A server has to be able to handle a large number of floating-point operations at one time. This task calls for processors with parallelism and high memory bandwidth.

 

When we look at computer architecture, we want performance suited to the user's environment. But when you get down to it, the chip must be able to run software that is not extremely complex. In reality, simplicity built on simplicity is the goal of good design.

 

Mobile devices

 

Mobile devices have evolved over the past 15 years, from multiple devices that each perform a single task to a single device that performs multiple tasks. Modern cell phones now have operating systems that support video, imaging, sound, speech recognition, and gaming, store large amounts of data, and perform arithmetic operations. With each passing year, mobile devices improve. What do we really demand of a mobile device? From this question we can identify constants that haven't changed since the creation of mobile devices: size, battery life, and performance. Size must be small, battery life must match or exceed that of the previous generation, and performance must improve as more applications are combined in one device.

The demand for constant network access is increasing. Everyday routines are becoming easier because of the constant demand to be able to perform computing tasks anywhere. These routines are not just for communication but for commerce as well. The idea that faster is better is giving way to the idea that connectivity and multi-tasking on demand are optimal. Using this reasoning, computer architects are rethinking modern chip design. A chip design for a desktop or a server is mainly concerned with processing large amounts of data at the same time. But this is not necessary for mobile devices, since real-time connectivity involves data that is not large but small and fast. Typical real-time video would be only 8 to 16 bits per pixel and would only need to be processed and displayed on the screen, compared with desktop and server environments, where 32 to 64 bits of information are processed and stored to memory. This leads again to the question of performance, which now has two different answers, each depending on the type of computing. For mobile computing, real-time processing is more an issue of reliability. The goal is coarse-grained parallelism and small data caches, forcing chip architecture to be redesigned. The demand for increasing bandwidth is in direct conflict with battery size and life.

Wireless devices must perform all operations on a budget of less than 2 watts, compared with a budget of tens of watts for most high-performance processors. This creates a demand for two things: more powerful small batteries, and a processor that consumes very little power. Battery development has come a long way, but it is not cheap. Neither is the development of a chip that consumes hardly any power. In the near future the answer will be a combination of a cost-efficient battery and a powerful low-power processor. To fully understand the effects of implementing a lower-power processor without giving up functionality, we must take into account the size of the memory on the device and the size of the executable that runs on the chip. Typically, the more tasks available on a device, the larger the software package. Larger software means larger memory, bringing the issue of complexity into play. If the complexity on a chip is large, the executable file will be as well.
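To make the power budget concrete, here is a minimal sketch of how long a battery lasts at a steady power draw. The 4 watt-hour battery capacity and the 20 watt figure standing in for "tens of watts" are assumptions chosen only for illustration; they do not come from any measured device.

```python
def runtime_hours(battery_wh: float, power_w: float) -> float:
    """Hours of operation for a battery of `battery_wh` watt-hours
    drained at a steady `power_w` watts."""
    if power_w <= 0:
        raise ValueError("power draw must be positive")
    return battery_wh / power_w


# Hypothetical handheld battery capacity (assumption, not from the article).
BATTERY_WH = 4.0

for label, watts in [
    ("2 W mobile budget", 2.0),
    ("20 W high-performance processor (assumed)", 20.0),
]:
    print(f"{label}: {runtime_hours(BATTERY_WH, watts):.1f} h")
```

Under these assumed numbers the same battery lasts ten times longer at the mobile budget, which is why processor power matters as much as battery capacity.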

 

Across the different types of computing devices, demand is growing for a chip that balances multiple requirements. Performance should be based not on cache bandwidth, size, or processor speed, but on the bandwidth between the memory and the processor, since a machine is only as fast as the bus between the processor and the memory. The right solution would be to decrease the distance between them as much as possible. The shorter the distance, the larger the region of the chip that is accessible in a single clock cycle. Another concern is the time needed to communicate an operation, since most designs for billion-transistor chips spend a great deal of time, over multiple cycles, on "communication among multiple cores or tiles." Taking all these concerns into account, an article published by the IEEE proposed a new type of chip design called the Vector IRAM.

 

VECTOR IRAM

 

Imagine a chip that combines memory and logic. This would offer a solution not only to communication speed but also to power consumption.

 

"The vector IRAM processor consists of an in-order, dual-issue superscalar processor with first-level caches, tightly integrated with a vector execution unit that contains eight pipelines. Each pipeline can support parallel operations on multiple media types and DSP functions. The memory system consists of 96 Mbytes of DRAM used as main memory. It is organized in a hierarchical fashion with 16 banks and eight sub-banks per bank, connected to the scalar and vector unit through a crossbar. This memory system provides sufficient sequential and random bandwidth even for demanding applications." 1

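The memory organization in the quotation above can be worked through numerically. The sketch below derives the bank and sub-bank sizes from the quoted figures (96 Mbytes, 16 banks, eight sub-banks per bank) and then maps an address to a bank and sub-bank. The mapping itself is purely hypothetical: the 32-byte interleave block, the choice to rotate consecutive blocks across banks, and the reading of "Mbyte" as 2^20 bytes are all assumptions, not details given in the source.

```python
TOTAL_BYTES = 96 * 2**20   # 96 Mbytes, assuming 1 Mbyte = 2**20 bytes
BANKS = 16                 # from the quotation
SUB_BANKS = 8              # sub-banks per bank, from the quotation
BLOCK_BYTES = 32           # hypothetical interleave granularity (assumption)

BANK_BYTES = TOTAL_BYTES // BANKS          # bytes held by one bank
SUB_BANK_BYTES = BANK_BYTES // SUB_BANKS   # bytes held by one sub-bank


def locate(addr: int) -> tuple[int, int]:
    """Map a byte address to (bank, sub_bank) under a hypothetical scheme:
    consecutive blocks rotate across banks, and each bank fills its
    sub-banks one after another."""
    if not 0 <= addr < TOTAL_BYTES:
        raise ValueError("address out of range")
    block = addr // BLOCK_BYTES
    bank = block % BANKS
    block_in_bank = block // BANKS
    sub_bank = block_in_bank // (SUB_BANK_BYTES // BLOCK_BYTES)
    return bank, sub_bank


print(BANK_BYTES // 2**20, "Mbytes per bank")
print(SUB_BANK_BYTES // 2**10, "Kbytes per sub-bank")
print(locate(0), locate(BLOCK_BYTES), locate(TOTAL_BYTES - 1))
```

Rotating blocks across banks is one common way to let sequential accesses proceed in parallel; whether the actual design does this is not stated in the quotation.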
 

The base of this processor is the vector design. Vector compilers are not as useful for large integer workloads but are very competitive in floating-point operations as well as graphics performance. Vector-based compilers have been available in the commercial environment for years. Using the vector design will keep complexity from increasing, allowing modular design. The main concern for the designers is the cost of implementing and testing the IRAM, since it combines memory and logic. To fully understand the combination of memory and logic on one chip, we must look at the type of memory used (DRAM), how it is produced, and the most cost-efficient way of implementing it on the chip.

 

DRAM

 

Producing chips and producing memory have been two separate things until now. The production of both involves many factors and tests. Hence combining the two would not provide a cost-effective chip in the short term, since both the memory and the logic would have to be tested, e.g. redundant logic testing. But this is the price that must be paid to develop a chip of this type. There are a few more disadvantages. The amount of memory on the chip might not be enough to store an application, meaning that the chip must access conventional memory, defeating the purpose of the chip. In addition, the DRAM refresh rate increases with operating temperature. If we were to set these constraints aside and focus on the future, we would have to wonder about the effects of the IRAM on commerce.

Since the IRAM is considered a microprocessor, it will compete with products from companies that have been in the microprocessor market for years. Even more frightening is that the main producers of the IRAM will be memory companies that have no experience in that market. The memory market does not have to concern itself with software. In production volume, the number of microprocessors made each year is not even close to the billions of DRAMs. "The key is finding a design that exploits the memory bandwidth potential of IRAM while leveraging software developed for traditional computing."

 

Implementing parallelism in the instruction set architecture decreases the size of the software and spends less wattage on high clock speed. Numerous tests were run on IRAM and many of its competitors to measure performance under different benchmarks. Among these benchmarks were SPMV (sparse matrix-vector multiplication), Histogram (computing a histogram of the pixels in a 500x500 image with a base-2 depth), and Mesh Adaptation (a mesh with a very large number of triangles, vertices, and edges). Overall, the IRAM beat its competitors not only in performance but also in using fewer watts per MOPS.
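For readers unfamiliar with these benchmarks, the sketch below shows what two of the kernels compute. SPMV is read here as sparse matrix-vector multiplication, written against the common compressed sparse row (CSR) layout. The article does not give the benchmark code, so the function names, the CSR layout, and the tiny inputs are all assumptions for illustration. Both kernels do little arithmetic per memory access and read or write memory at data-dependent locations, which is why they stress memory bandwidth more than raw compute.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product y = A @ x with A stored in CSR form.
    values/col_idx hold the nonzeros row by row; row_ptr[r]:row_ptr[r+1]
    is the slice of nonzeros belonging to row r."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]   # gather from x by column index
        y.append(acc)
    return y


def histogram(pixels, bins):
    """Count how many pixels fall on each intensity value in [0, bins)."""
    counts = [0] * bins
    for p in pixels:
        counts[p] += 1                         # data-dependent update
    return counts


# Illustrative 3x3 matrix [[2, 0, 1], [0, 3, 0], [4, 0, 5]] in CSR form.
values = [2.0, 1.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
print(spmv_csr(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))
print(histogram([0, 1, 1, 3, 0], bins=4))
```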

 

As covered above in the mobile devices section, demand for this type of computing is increasing rapidly. Adoption of low-energy memory is not only the goal but nearly a reality. Today's microprocessors tend to be energy hogs in memory loads and stores. IRAM provides a simple solution by placing memory and a microprocessor on a single die, eliminating the need for large on-chip cache memory. The architecture uses DRAM instead of traditional SRAM because the high density of DRAM allows more on-chip memory. The goal of the IRAM architecture is to cut external memory accesses to a bare minimum.

This goal is motivated by the fact that, because of slow off-chip buses, processor performance improves each year at a rate more than six times that of memory, and by the large gain in energy efficiency of single-chip processors with large on-chip memory.
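A short calculation shows how quickly such a difference in improvement rates compounds. The annual rates used below (60% for processors, 7% for memory) are illustrative assumptions only; the article states just that the processor rate is more than six times the memory rate.

```python
def gap_after(years: int, cpu_rate: float = 0.60, mem_rate: float = 0.07) -> float:
    """Ratio of processor speedup to memory speedup after `years`,
    assuming each improves by a fixed fraction per year (assumed rates)."""
    return ((1.0 + cpu_rate) / (1.0 + mem_rate)) ** years


for years in (1, 5, 10):
    print(f"after {years:2d} years the gap has grown {gap_after(years):.1f}x")
```

Because the ratio is raised to the power of the number of years, even a modest yearly difference turns into a very large gap within a decade, which is the motivation for moving memory onto the processor die.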

 

 

 

Today there are many high-performance machines that run below their peak arithmetic performance due to a lack of memory bandwidth. The IRAM architecture addresses this problem by combining DRAM and the microprocessor on a single chip. In fact, the peak memory bandwidth of the IRAM is 6.4 GB per second, which is 5 to 10 times higher than that of most machines. Its explicit parallelism in vector instructions also yields higher arithmetic performance.
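One simple way to see why bandwidth caps arithmetic performance is to bound the attainable rate by the smaller of the peak compute rate and the rate at which memory can feed operands. The sketch below applies that bound using the 6.4 GB per second figure from the text. The peak rate of 1.6 billion operations per second, the kernel's 0.25 operations per byte, and the 0.8 GB per second comparison machine (one eighth of the IRAM figure, inside the 5 to 10 times range) are assumptions for illustration.

```python
def attainable_gops(peak_gops: float, bandwidth_gb_s: float, ops_per_byte: float) -> float:
    """Upper bound on sustained performance: a kernel cannot run faster than
    the processor's peak, nor faster than memory can supply its data."""
    return min(peak_gops, bandwidth_gb_s * ops_per_byte)


PEAK_GOPS = 1.6        # hypothetical peak arithmetic rate (assumption)
OPS_PER_BYTE = 0.25    # hypothetical kernel intensity (assumption)

iram = attainable_gops(PEAK_GOPS, 6.4, OPS_PER_BYTE)    # 6.4 GB/s from the text
other = attainable_gops(PEAK_GOPS, 0.8, OPS_PER_BYTE)   # assumed 8x lower bandwidth

print(f"IRAM-like bandwidth:   {iram:.2f} Gops/s")
print(f"8x lower bandwidth:    {other:.2f} Gops/s")
```

With the same peak arithmetic rate, the lower-bandwidth machine in this sketch sustains only a fraction of its peak, which matches the article's point that many machines run below peak for lack of memory bandwidth.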

 

The IRAM chip uses the MIPS architecture and has 32 64-bit vector registers. Each register can be subdivided to operate on 8-, 16-, 32-, and 64-bit data types. Subdividing registers greatly increases the number of elements that can be stored, thus increasing arithmetic performance.
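The effect of subdividing a register can be sketched in software by treating a 64-bit word as a row of narrower lanes. This is only a model of the idea described above, not the chip's instruction set; the function names are hypothetical, and real hardware performs the lane-wise operation in a single instruction instead of a loop.

```python
REGISTER_BITS = 64


def elements_per_register(width: int) -> int:
    """Number of elements of `width` bits that fit in one 64-bit register."""
    if width not in (8, 16, 32, 64):
        raise ValueError("unsupported element width")
    return REGISTER_BITS // width


def pack(elements, width):
    """Pack elements into one 64-bit word, element 0 in the lowest bits."""
    if len(elements) != elements_per_register(width):
        raise ValueError("wrong number of elements for this width")
    mask = (1 << width) - 1
    word = 0
    for i, e in enumerate(elements):
        word |= (e & mask) << (i * width)
    return word


def unpack(word, width):
    """Split a 64-bit word back into its elements."""
    mask = (1 << width) - 1
    return [(word >> (i * width)) & mask for i in range(elements_per_register(width))]


def packed_add(a, b, width):
    """Lane-wise add: each lane wraps around on overflow and never
    carries into its neighbour."""
    sums = [x + y for x, y in zip(unpack(a, width), unpack(b, width))]
    return pack(sums, width)


a = pack([250, 1, 2, 3, 4, 5, 6, 7], 8)
b = pack([10] * 8, 8)
print(elements_per_register(8), "elements at 8 bits;", elements_per_register(64), "at 64 bits")
print(unpack(packed_add(a, b, 8), 8))
```

One 64-bit addition handles a single element, while the same register width at 8 bits handles eight, which is the gain in arithmetic throughput the paragraph refers to.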

 

Address generation is very important for any microprocessor. Memory addresses have to be checked for validity and collisions before being processed. The IRAM architecture allows four 64-bit lanes per clock cycle, and having only four lanes limits performance. Performance increases with four 32-bit lanes, outperforming Pentium 4 chips, but peak performance is reached at this level.

 

There are several ways of implementing memory and processors together. The first is the traditional way: placing large and expensive SRAM memory with a CPU in a logic process. The second is the IRAM architecture: placing CPU logic with the memory. The second choice is the optimal choice in energy and cost. By placing DRAM on-chip we can eliminate the need for second-level on-chip caches. First-level caches remain desirable for performance. By keeping a small on-chip cache and implementing a DRAM array, we can assume that nearly all memory tasks can be performed on-chip without external memory accesses.

 

The IRAM project promises many achievements that will provide simple solutions to the ongoing demand for performance networking. Implementing these ideas, as you can see, is the first stage; producing the finished product in a cost-effective way is the second. One thing is certain: the new and creative idea of combining memory and a microprocessor on a single chip is not going away. Personal connectivity has become a constant factor of daily routines, and commerce has split into multiple playing fields, forming a wireless race to an undefined venue. This venue will shape the future of computer architectures.

 

For more info please go to http://dboorn.com - Daniel Boorn






Article added to SearchWarp.com on Saturday, June 18, 2005