cpu

Union and endianness

家住魔仙堡 · submitted on 2020-01-01 10:02:10

Question:

#include <stdio.h>

typedef union status {
    int nri;
    char cit[2];
} Status;

int main() {
    Status s;
    s.nri = 1;
    printf("%d\n", s.nri);
    printf("%d,%d\n", s.cit[0], s.cit[1]);
}

OUTPUT:
1
0,1

I know the output on the second line depends on the endianness of the CPU. How can I write such a program in a platform-independent way? Is there any way to check the endianness of the CPU?

Answer 1: You can use htonl() and/or ntohl(). htonl() stands for "host to network long", while ntohl() stands for "network to host long". The "host"

What is the purpose of hard disk direct memory access?

风格不统一 · submitted on 2020-01-01 09:25:58

Question: At first glance it seems like a good idea to let the hard disk write to RAM on its own, without CPU instructions copying data, particularly with the success of asynchronous networking in mind. But the Wikipedia article on Direct Memory Access (DMA) states: "With DMA, the CPU gets freed from this overhead and can do useful tasks during data transfer (though the CPU bus would be partly blocked by DMA)." I don't understand how a bus line can be "partly blocked". Presumably memory can be

How to measure cpu time and wall clock time?

允我心安 · submitted on 2020-01-01 07:01:08

Question: I saw many topics about this, even on Stack Overflow, for example: "How can I measure CPU time and wall clock time on both Linux/Windows?" I want to measure both CPU and wall time. Although the person who answered the question in the topic I posted recommended using gettimeofday to measure wall time, I read that it is better to use clock_gettime instead. So I wrote the code below (is it OK? Does it really measure wall time, not CPU time? I'm asking because I found a webpage: http://nadeausoftware.com

Determine whether memory location is in CPU cache

混江龙づ霸主 · submitted on 2020-01-01 05:05:07

Question: It is possible for an operating system to determine whether a page of memory is in DRAM or in swap: simply try to access it, and if a page fault occurs, it wasn't. However, is the same thing possible with the CPU cache? Is there any efficient way to tell whether a given memory location has been loaded into a cache line, or to know when it does so? Answer 1: In general, I don't think this is possible. It works for DRAM and the pagefile because those are OS-managed resources; the cache is managed

How does the CPU/assembler know the size of the next instruction?

ぃ、小莉子 · submitted on 2020-01-01 03:24:08

Question: For the sake of example, imagine I was building a virtual machine. I have a byte array and a while loop; how do I know how many bytes to read from the byte array for the next instruction, to interpret an Intel 8086-like instruction? EDIT: (commented) The CPU reads the opcode at the instruction pointer; with the 8086 and CISC you have one-byte and two-byte instructions. How do I know if the next instruction is F or FF? EDIT: Found an answer myself in this piece of text on http://www.swansontec.com/sintel

What is the relation between CPU utilization and energy consumption?

≯℡__Kan透↙ · submitted on 2019-12-31 18:58:10

Question: What is the function that describes the relation between CPU utilization and consumption of energy (electricity/heat-wise)? I wonder if it is linear, sub-linear, exponential, etc. I am writing a program that decreases the CPU utilization/load of other programs, and my main concern is how much I benefit energy-wise. Moreover, my server is mostly being used as a web server or DB in a data center (headless). In case the data center needs more power for cooling, I need to consider that as well. I also

CPU SIMD vs GPU SIMD?

若如初见. · submitted on 2019-12-31 11:42:43

Question: GPUs use the SIMD paradigm; that is, the same portion of code is executed in parallel and applied to various elements of a data set. However, CPUs also use SIMD and provide instruction-level parallelism. For example, as far as I know, SSE-like instructions process data elements in parallel. While the SIMD paradigm seems to be used differently in GPUs and CPUs, do GPUs have more SIMD power than CPUs? In which way are the parallel computational capabilities of a CPU 'weaker'

What's the point of multi-threading on a single core?

自作多情 · submitted on 2019-12-31 08:51:48

Question: I've been playing with the Linux kernel recently, diving back into the days of OS courses from college. Just like back then, I'm playing around with threads and the like. All this time I had assumed that threads automatically run concurrently on multiple cores, but I've recently discovered that you actually have to code explicitly for handling multiple cores. So what's the point of multithreading on a single core? The only example I can think of is from college, when writing

Is it true that if we can always fill the delay slot, there is no need for branch prediction?

末鹿安然 · submitted on 2019-12-31 02:54:06

Question: I'm looking at the five-stage MIPS pipeline (IF, ID, EX, MEM, WB) in H&P 3rd ed., and it seems to me that the branch decision is resolved in the ID stage, so that by the time the branch instruction reaches its EX stage, the second instruction after the branch can be fetched correctly. But this still leaves the problem of possibly wasting the first instruction right after the branch. I also encountered the concept of the branch delay slot, which means you want to fill the
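What "filling the slot" means can be sketched in MIPS assembly (a schematic example with invented register choices; real compilers do this as part of instruction scheduling, and can only fill the slot usefully when an independent instruction exists):

```
# Before scheduling: the instruction after the branch always executes,
# so an unfilled delay slot is a wasted cycle.
        add  $t0, $t1, $t2      # independent of the branch condition
        beq  $s0, $s1, target
        nop                     # wasted delay slot

# After scheduling: the independent add is moved into the slot.
        beq  $s0, $s1, target
        add  $t0, $t1, $t2      # executes either way; slot now does real work
```

This also suggests the answer to the title question: the one-instruction slot only hides the bubble of a pipeline that resolves branches very early (as in the classic 5-stage MIPS); deeper pipelines have many more cycles of branch latency than any fillable slot can cover, which is why prediction is still needed.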

Is duplication of state resources considered optimal for hyper-threading?

*爱你&永不变心* · submitted on 2019-12-31 02:21:08

Question: This question has an answer that says: "Hyper-threading duplicates internal resources to reduce context switch time. Resources can be: registers, arithmetic unit, cache." Why did CPU designers end up duplicating state resources for simultaneous multithreading (or Hyper-Threading on Intel)? Why wouldn't tripling (quadrupling, and so on) those same resources give us three logical cores and, therefore, even higher throughput? Is the duplication that researchers arrived at in some sense optimal