hyperthreading

Determining the independent CPUs (specified with affinity IDs) for building ATLAS

天涯浪子 posted on 2019-12-06 02:30:20
I'm trying to determine the independent CPUs (specified with affinity IDs) for building ATLAS on a Linux machine with 4 Intel CPUs with hyperthreading (Ubuntu 12.04). The reason I'm doing this is that the ATLAS manual says to use only the physical cores on machines with hyperthreading. On how to achieve this, it says: "...you can tell ATLAS to use only the real cores if you learn a little about your machine. Unfortunately, ATLAS cannot presently autodetect these features, but if you experiment you can discover which affinity IDs are the separate cores,..." Further on a hint is given on how
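
One way to approach the question above on Linux is to read the CPU topology from sysfs rather than experimenting. The sketch below is only an illustration and is not the method hinted at in the ATLAS manual (that hint is cut off in the excerpt). It assumes the kernel exposes `physical_package_id` and `core_id` under `/sys/devices/system/cpu/cpuN/topology/`; the function names `one_cpu_per_core` and `read_topology` are made up for this example. It keeps the lowest-numbered logical CPU of each physical core, which gives one candidate affinity ID per core.

```python
from pathlib import Path


def one_cpu_per_core(topology):
    """Pick one logical CPU per physical core.

    `topology` maps a logical CPU number to a (package_id, core_id) pair.
    The lowest-numbered logical CPU of each core is kept.
    """
    chosen = {}
    for cpu in sorted(topology):
        # setdefault only stores the first (lowest) CPU seen for each core.
        chosen.setdefault(topology[cpu], cpu)
    return sorted(chosen.values())


def read_topology(base=Path("/sys/devices/system/cpu")):
    """Read (package_id, core_id) for every CPU that sysfs describes.

    Assumption: a Linux sysfs layout. CPUs whose topology files are
    missing or unreadable are skipped, so the result may be empty.
    """
    topology = {}
    for cpu_dir in base.glob("cpu[0-9]*"):
        topo = cpu_dir / "topology"
        try:
            cpu = int(cpu_dir.name[3:])
            package = int((topo / "physical_package_id").read_text())
            core = int((topo / "core_id").read_text())
        except (OSError, ValueError):
            continue
        topology[cpu] = (package, core)
    return topology


if __name__ == "__main__":
    print("one logical CPU per physical core:", one_cpu_per_core(read_topology()))
```

On a 4-core, 8-thread machine this would be expected to print four IDs. Which four depends on how the kernel numbers the hyperthreads, so the list should be checked against the machine rather than assumed.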

Hyper-threading Performance Comparison

纵然是瞬间 posted on 2019-12-05 19:43:01
I have written a project that uses some basic OpenSSL functions such as RAND_bytes and des_ecb_encrypt. My computer has an i7-2600 (4 cores and 8 logical CPUs). When I run my project with 4 threads, it takes 10 seconds. When I run it with 8 threads, it also takes 10 seconds. In other words, hyper-threading doesn't give me any performance improvement. On Linux, the result is the same. I found an article here that says hyper-threading gives no improvement in some situations, and another here that gives some intuitive results. However, I have tried to write some simple tests

hyperthreading code example

我是研究僧i posted on 2019-12-05 19:26:00
Is there some sample code that exemplifies Intel's Hyperthreading performance? Is it at all accessible from user space, or does the CPU do all the work transparently for the programmer? This is for C on Linux. Hyperthreading performance depends on many factors and is difficult to estimate. To briefly explain Hyperthreading: each core has more than one register set, but no additional execution units, and the hyperthreads are scheduled more or less evenly. So you only really get additional performance out of hyperthreads if the two threads running on the same core use different execution units and

Matlab limits TBB but not OpenMP

拜拜、爱过 posted on 2019-12-05 16:42:10
I'm only asking this to try to understand what I've spent 24 hours trying to fix. My system: Ubuntu 12.04.2 and Matlab R2011a, both 64-bit, on an Intel Xeon processor based on Nehalem. The problem is simply that Matlab allows OpenMP-based programs to utilize all CPU cores with hyper-threading enabled but does not allow the same for TBB. When running TBB, I can launch only 4 threads, even when I change maxNumCompThreads to 8, while with OpenMP I can use all the threads I want. Without hyper-threading, both TBB and OpenMP utilize all 4 cores, of course. I understand Hyper-threading and that its

Enable Intel Hyperthreading in Java

ⅰ亾dé卋堺 posted on 2019-12-05 09:11:38
I have a multithreaded program running on a quad-core Intel i7. When I execute Runtime.getRuntime().availableProcessors(), I get 8, and I know that hyperthreading is available on this CPU. However, when I create threads, my CPU levels are at 100% (i.e. non-zero) for 4 threads, meaning that 4 threads are unused. Is there any way to enable hyperthreading in Java? Hyperthreading is enabled by the fact that all modern JVMs use native threads, so this is an OS/CPU configuration setting. However, Hyperthreading does not give you extra cores; it permits fine-grained time-sharing of the four CPUs that you have.

OpenMP: don't use hyperthreading cores (half `num_threads()` w/ hyperthreading)

丶灬走出姿态 posted on 2019-12-05 08:58:14
In "Is OpenMP (parallel for) in g++ 4.7 not very efficient? 2.5x at 5x CPU", I determined that the performance of my programme varies between 11s and 13s (almost always above 12s, and sometimes as slow as 13.4s) at around 500% CPU when using the default #pragma omp parallel for, and that the OpenMP speed-up is only 2.5x at 5x CPU with g++-4.7 -O3 -fopenmp on a 4-core 8-thread Xeon. I tried using schedule(static) num_threads(4), and noticed that my programme always completes in 11.5s to 11.7s (always below 12s) at about 320% CPU, i.e., it runs more consistently and uses fewer resources (even if the

Is HyperThreading / SMT a flawed concept?

跟風遠走 posted on 2019-12-04 03:34:06
The primary idea behind HT/SMT was that when one thread stalls, another thread on the same core can co-opt the rest of that core's idle time and run with it, transparently. In 2013 Intel dropped SMT in favor of out-of-order execution for its Silvermont processor cores, as they found this gave better performance. ARM no longer supports SMT (for energy reasons). AMD never supported it. In the wild, we still have various processors that support it. From my perspective, if data and algorithms are built to avoid cache misses and subsequent processing stalls at all costs, surely HT is a redundant

Single-CPU programs running on Hyper-Threading-enabled quadcore CPU

て烟熏妆下的殇ゞ posted on 2019-12-03 16:27:31
Question: I'm a researcher in statistical pattern recognition, and I often run simulations that run for many days. I'm running Ubuntu 12.04 with Linux 3.2.0-24-generic, which, as I understand, supports multicore and hyper-threading. With my Intel Core i7 Sandy Bridge quad-core with HTT, I often run 4 simulations (programs that take a long time) at the same time. Before I ask my question, here are the things that I already (think I) know. My OS (Ubuntu 12.04) detects 8 CPUs due to hyper-threading. The

OpenMP drastic slowdown for specific thread number

断了今生、忘了曾经 posted on 2019-12-03 14:23:06
I ran an OpenMP program to perform the Jacobi method, and it was working very well: 2 threads performed slightly over 2x 1 thread, and 4 threads 2x faster than 1 thread. I felt everything was working perfectly... until I reached exactly 20, 22, and 24 threads. I kept breaking it down until I had this simple program:

    #include <stdio.h>
    #include <stdlib.h> /* for atoi */
    #include <omp.h>

    int main(int argc, char *argv[])
    {
        int i, n, maxiter, threads, nsquared, execs = 0;
        double begin, end;

        if (argc != 4) {
            printf("4 args\n");
            return 1;
        }
        n = atoi(argv[1]);
        threads = atoi(argv[2]);
        maxiter = atoi(argv[3]);
        omp_set_num_threads

Linux find out Hyper-threaded core id

删除回忆录丶 posted on 2019-12-03 09:26:50
Question: I spent this morning trying to find out how to determine which processor ID is the hyper-threaded core, but without luck. I wish to find out this information and use set_affinity() to bind a process to a hyper-threaded thread or a non-hyper-threaded thread to profile its performance. Answer 1: I discovered a simple trick to do what I need. cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list If the first number is equal to the CPU number (0 in this example) then it's a real core, if not it
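
The check described in the answer above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original answer. It assumes `thread_siblings_list` holds a comma-separated list that may also contain ranges (for example `0,4` or `0-1`); the helper names `parse_cpu_list`, `is_first_sibling` and `read_siblings` are hypothetical.

```python
from pathlib import Path

SYSFS_CPU = Path("/sys/devices/system/cpu")


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0,4" or "0-1" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def is_first_sibling(cpu, siblings_text):
    """Return True if `cpu` is the lowest-numbered entry in its sibling list.

    This mirrors the rule in the answer: the first number equals the CPU number.
    """
    siblings = parse_cpu_list(siblings_text)
    return bool(siblings) and siblings[0] == cpu


def read_siblings(cpu):
    """Return the raw sibling list for `cpu`, or None if sysfs does not provide it."""
    path = SYSFS_CPU / f"cpu{cpu}" / "topology" / "thread_siblings_list"
    try:
        return path.read_text()
    except OSError:
        return None


if __name__ == "__main__":
    text = read_siblings(0)
    if text is None:
        print("topology information not available")
    else:
        print("cpu0 siblings:", parse_cpu_list(text))
        print("cpu0 is first sibling:", is_first_sibling(0, text))
```

Both siblings of a core are hardware threads of equal standing, so "first sibling" is only a convention for picking one logical CPU per core when setting affinity.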