Message boards : Number crunching : New FERMI GPU, 4x more cores, more memory
Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0
Nvidia has now announced a new architecture with on-chip L1 and L2 cache memory to support more memory-intensive applications: FERMI
Would the 768K L2 cache be sufficient to put a dent in the Rosetta memory requirements to run on a GPU?
Joined: 30 Dec 05 Posts: 1755 Credit: 4,690,520 RAC: 0
Here's a nice article talking about why protein folding is an important and challenging field of study and how some are using GPUs for such atomic modeling. Probing Biomolecular Machines with Graphics Processors. (be sure to click the settings item on the menu bar to format to a readable size)
zpm Joined: 21 Mar 09 Posts: 6 Credit: 349,801 RAC: 0
Here's a nice article talking about why protein folding is an important and challenging field of study and how some are using GPUs for such atomic modeling.
Drug Discovery@Home is in the process of trying to get GPUs (ATI and Nvidia) working, and hopefully multi-threading like AQUA@home has, but it's a slow work in progress with one man doing 95% of the work; Ageless is working on the ATI app. http://boinc.drugdiscoveryathome.com If you're interested in helping us, PM me for an invite code.
Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0
Joined: 2 Jul 06 Posts: 2842 Credit: 2,020,043 RAC: 0
Ahhhh.... congratulations to you!
zpm Joined: 21 Mar 09 Posts: 6 Credit: 349,801 RAC: 0
Ahhhh.... college algebra is tough. Maybe you and I should hook up and see what/where we are... my quarter just began.
Joined: 16 Jun 08 Posts: 1235 Credit: 14,372,156 RAC: 382
Would the 768K L2 cache be sufficient to put a dent in the Rosetta memory requirements to run on a GPU?
Not much of a dent, since minirosetta currently requires about 500 MB of memory to run on just one processor, a few hundred times as much as that cache. If you plan to use all the cores in order to get the maximum speedup, multiply 500 MB by the number of cores to get the approximate amount of memory needed on the GPU card, at least without a major, and therefore rather slow, rewrite of the program. It should be easier to start with a version that runs minirosetta on only as many cores as there is enough memory for, with one more core used to combine the various data streams. Much less of a speedup, but still some.
Also, it looks to me like the compilers that allow the use of languages such as C++ and Fortran to prepare the computer code are likely to only prepare it for cards with the new GT300 series of GPU chips, and not the GPU cards sold in the past. Something to ask Nvidia about, at least.
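For illustration, a minimal sketch of the arithmetic described above. The 500 MB per-task figure, the card's memory size and the core count are assumptions taken from this discussion, not measured values:

```cpp
// Back-of-the-envelope estimate: how many independent minirosetta-style tasks
// fit in GPU memory if each needs its own ~500 MB working set?
// All figures below are illustrative assumptions, not measurements.
#include <cstdio>

int main() {
    const double mbPerTask   = 500.0;   // assumed per-simulation footprint
    const double gpuMemoryMb = 1536.0;  // hypothetical 1.5 GB Fermi-class card
    const int    gpuCores    = 512;     // Fermi-class core count

    int tasksThatFit = static_cast<int>(gpuMemoryMb / mbPerTask);
    printf("Cores available: %d\n", gpuCores);
    printf("Independent tasks that fit in memory: %d\n", tasksThatFit);
    printf("Memory needed to keep every core busy with its own task: %.0f MB\n",
           gpuCores * mbPerTask);
    return 0;
}
```

On those assumed numbers only about three independent tasks fit, which is why the posts in this thread suggest either sharing one copy of the data or using only a few of the cores.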
Joined: 16 Oct 05 Posts: 711 Credit: 26,694,507 RAC: 0
Would the 768K L2 cache be sufficient to put a dent in the Rosetta memory requirements to run on a GPU?
Wouldn't writing to RAM be useful?
Joined: 3 Nov 05 Posts: 1833 Credit: 120,031,044 RAC: 7,827
I believe the Baker Lab guys had to do some rewriting of Rosetta to get it working efficiently on Blue Waters (or was it a different machine?), which I would assume meant getting it to run a single task in parallel rather than having each CPU run a separate simulation (otherwise it wouldn't have made use of the fact that the CPUs could communicate with each other, which I believe was the whole point of using a supercomputer rather than BOINC). If I'm right (long odds!) then I'd guess that'd be the way to go for GPGPU as well, since you only need one copy of the protein in RAM then (rather than one per core). I can't begin to imagine where you'd start with getting the cores to work on the same task together, though.
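As a hedged illustration of that idea (this is not Rosetta's actual code; the energy function, sizes and names are made up), here is a toy CUDA kernel where many GPU threads cooperate on one scoring pass over a single shared copy of the coordinates, instead of each core holding its own full working set:

```cpp
// Toy sketch: many GPU threads score ONE conformation together, all reading
// the same copy of the atom coordinates in device memory.
// The "energy" is a made-up inverse-distance sum, purely illustrative.
// Note: atomicAdd on float needs a Fermi-class (sm_20) or newer GPU.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void toyEnergy(const float3* coords, int nAtoms, float* energy) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nAtoms) return;

    float local = 0.0f;
    for (int j = i + 1; j < nAtoms; ++j) {   // each thread takes a slice of the pairs
        float dx = coords[i].x - coords[j].x;
        float dy = coords[i].y - coords[j].y;
        float dz = coords[i].z - coords[j].z;
        float r2 = dx * dx + dy * dy + dz * dz + 1e-6f;
        local += 1.0f / r2;                  // stand-in for a real score term
    }
    atomicAdd(energy, local);                // combine the partial sums
}

int main() {
    const int nAtoms = 1024;                 // hypothetical protein size
    float3* dCoords = nullptr;
    float*  dEnergy = nullptr;
    cudaMalloc(&dCoords, nAtoms * sizeof(float3));
    cudaMalloc(&dEnergy, sizeof(float));
    cudaMemset(dCoords, 0, nAtoms * sizeof(float3));  // dummy coordinates
    cudaMemset(dEnergy, 0, sizeof(float));

    toyEnergy<<<(nAtoms + 255) / 256, 256>>>(dCoords, nAtoms, dEnergy);

    float energy = 0.0f;
    cudaMemcpy(&energy, dEnergy, sizeof(float), cudaMemcpyDeviceToHost);
    printf("Toy energy: %f\n", energy);

    cudaFree(dCoords);
    cudaFree(dEnergy);
    return 0;
}
```

The point is only that the coordinates are stored once and every thread reads them, so memory use doesn't multiply with the number of cores; the hard part, as the post says, is deciding how to split the real work between threads.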
Joined: 16 Jun 08 Posts: 1235 Credit: 14,372,156 RAC: 382
Would the 768K L2 cache be sufficient to put a dent in the Rosetta memory requirements to run on a GPU?
Yes, but just how useful is being able to write to an extra amount of memory that is less than one percent of what minirosetta needs to run on one processor, without a major rewrite of the program? The total memory I referred to IS RAM, with a significant slowdown if swapping to the hard drive is used instead. Reaching the hard drive typically takes over a hundred times as long as reaching RAM.
Sharing the sections of the database that contain the same values regardless of which GPU core uses them is a good second step, although I'd assume that's a rather small fraction of the total amount of memory each processor needs.
I believe the Milkyway@home project has found a way to get some GPU acceleration without a major rewrite of a program with high memory-per-processor requirements: just don't try to get the maximum speedup by using all the GPU cores. Instead, use only as many as there is enough graphics board memory for. Much less speedup, but it allows getting at least some with much less work for the project team.
Some people might consider rewriting a computer program in a different computer language, even without rearranging its database, to be a major rewrite. However, even though this is required with current versions of the compiler software, it should be less of a major rewrite than rewriting the program in a different language and making a drastic rearrangement of the database at the same time.
The new compilers Nvidia is planning to release in the next few weeks should reduce the amount of effort needed for such a rewrite; but Nvidia hasn't made it very clear whether those new compilers will work with the chips they've sold in the past as well as the new GT300, or only the new GT300 series chips.
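A minimal sketch of that Milkyway-style approach, assuming roughly 500 MB per simulation as discussed above: ask the CUDA runtime how much memory the card actually has free and size the number of concurrent tasks from that, rather than from the core count.

```cpp
// Size the number of concurrent simulations from free GPU memory,
// not from the number of GPU cores. The 500 MB figure is the
// per-task assumption used earlier in this thread.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    size_t freeBytes = 0, totalBytes = 0;
    if (cudaMemGetInfo(&freeBytes, &totalBytes) != cudaSuccess) {
        printf("No usable CUDA device found.\n");
        return 1;
    }

    const size_t bytesPerTask = 500ull * 1024 * 1024;   // assumed footprint
    size_t concurrentTasks = freeBytes / bytesPerTask;  // may well be 0 or 1

    printf("GPU memory: %zu MB total, %zu MB free\n",
           totalBytes >> 20, freeBytes >> 20);
    printf("Simulations that fit at once: %zu\n", concurrentTasks);
    return 0;
}
```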
mikey Joined: 5 Jan 06 Posts: 1896 Credit: 10,138,586 RAC: 25,558
The new compilers Nvidia is planning to release in the next few weeks should reduce the amount of effort needed for such a rewrite; but Nvidia hasn't made it very clear whether those new compilers will work with the chips they've sold in the past as well as the new GT300, or only the new GT300 series chips.
Maybe it's time to upgrade?! Actually, it may be time to have multiple versions available depending on which generation of GPU a user has. That could even lead to different types of units being made available depending on the GPU: high-end GPUs can crunch all units, lower-end ones can only crunch some units. In short, keep what they have and add to it, not replace it. Yes, that could be a whole lot more work down the road support-wise, but it should be a bit easier in the short term. And then as time goes on and more and better GPUs become available and more popular (the 400s?), the pre-300 ones could be dropped. Each project kind of does this already, although they do it with the CPU and type of OS, i.e. Mac, Linux, Windows, etc. Microsoft has always said that keeping Windows backwards compatible has been the sticking point to making Windows all it can be. Keep making units like they do now, just make the new version and new units just for it. Maybe even make the new units better and more detailed as far as the research end goes, to take advantage of the new cards' capabilities.