Daniel López Azaña

Differences between physical CPU vs logical CPU vs Core vs Thread vs Socket

When we try to understand a computer’s architecture and performance at the CPU level using Linux commands such as nproc or lscpu, we often find that we cannot interpret their output properly because we confuse terms such as physical CPU, logical CPU, virtual CPU, core, thread and socket. If we add concepts like HyperThreading (not to be confused with multithreading), we end up unsure how many cores our box has, and we don’t understand why commands like htop report 8 CPUs when we thought we had bought a single quad-core processor. In short, it’s a mess.

htop command output

To clear this up, I will explain all these concepts using a pair of simple diagrams that I hope will help you understand them easily and never have these doubts again.

The origins: single core CPUs and HyperThreading

Before concepts such as multi-core, virtual CPU or logical CPU existed, back in the days of the Pentium processors, most computers had a single chip of considerable size mounted on their motherboard, which we called the microprocessor, processor or simply CPU. Only a few enterprise computers or larger servers that required more processing power could afford to mount 2 or more of these chips on the same board: they were multiprocessor systems. These chips communicated with the other motherboard elements through a connector or socket. And the math was simple: a computer could have at most as many CPUs as its board had sockets. If you wanted more processing power, you just had to look for a motherboard with more sockets or wait for processors to evolve and offer higher performance.

But then Intel realized that communication between the different processors of a multiprocessor system was very inefficient, since it had to go through the system bus, which usually worked at a much lower speed. This frequently caused bottlenecks that made it impossible to make the most of the computing capacity offered by each CPU.

Single core CPU with HyperThreading diagram

To improve this situation, HyperThreading technology was invented. HT duplicates, within the same chip, the internal components that hold each thread’s state, such as the registers, while the execution units and caches are shared, so that two different execution threads can run on one processor without having to go through the system bus, with its bottlenecks and loss of speed. It also meant that if one thread had to wait, for example for an interrupt, the other could keep using the CPU instead of leaving it idle.

This made it possible to speed up several computing processes, and processors began to be offered with greater overall performance than traditional ones. The operating system was, in a sense, tricked: it was offered 2 virtual or logical CPUs (LCPUs) instead of a single one, and was allowed to execute 2 processes “at the same time”. But it is important to note that this could not deliver twice the processing power of a traditional processor, nor full parallel processing capabilities.

Thus, from the point of view of Linux or any other operating system, a box with 1 single-core processor with HT appears to have 2 CPUs. But these are 2 logical CPUs running within the same single physical CPU.
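To see the number the operating system actually reports, here is a minimal Python sketch using only the standard library. The helper name `visible_logical_cpus` is my own, chosen for illustration. It counts logical CPUs, so on a single-core HT box it would be expected to return 2, not 1.

```python
import os


def visible_logical_cpus() -> int:
    """Return how many logical CPUs (LCPUs) this process may run on.

    Hypothetical helper for illustration: it says nothing about how many
    physical chips or cores sit behind those logical CPUs.
    """
    # On Linux, sched_getaffinity(0) returns the set of logical CPU ids
    # that the current process is allowed to use.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # On other platforms, fall back to the total logical CPU count,
    # which os.cpu_count() may report as None if it cannot be determined.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("Logical CPUs visible to this process:", visible_logical_cpus())
```

Note that the result can be lower than the machine total if the process has been restricted to a subset of CPUs, for example with taskset.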

One more twist: the emergence of multi-core architectures

As I said in the previous section, although CPUs with HyperThreading offer more processing power, they cannot perform like 2 complete and independent processors. So the next step was to miniaturize all the processor components further and package several processors together in a single chip. Each of these packaged processors was called a core, and the design allowed faster communication between them over an internal bus on the same silicon. From that moment on it was no longer necessary to go through the much slower system bus.

Quad-core CPU with HyperThreading diagram

Unlike with HT technology, we now have multiple CPUs that are completely independent to all intents and purposes, one per core. Indeed, from a performance point of view it’s better to have a single multi-core processor than the equivalent number of single-core CPUs on the same board. Of course it would still be better to have 2 dual-core processors than one, but even better would be to have a single quad-core.

At the operating system level, a physical quad-core processor is shown as a 4-CPU computer. But these are 4 logical CPUs (LCPUs), not 4 physical CPUs. If the processor additionally offers HyperThreading technology, commands such as htop or nproc will report 8 CPUs in the system, but these will perform worse than the 8 CPUs of a single octa-core processor without HyperThreading.
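The arithmetic behind those numbers can be sketched in a few lines of Python. The function name `logical_cpu_count` and its parameters are hypothetical. The sketch assumes every socket has the same number of cores and every core the same number of threads, which does not hold for hybrid designs that mix core types.

```python
def logical_cpu_count(sockets: int, cores_per_socket: int, threads_per_core: int) -> int:
    """Logical CPUs the OS would report for a homogeneous machine.

    Assumption for illustration: identical sockets and identical cores.
    """
    if min(sockets, cores_per_socket, threads_per_core) < 1:
        raise ValueError("all three counts must be at least 1")
    return sockets * cores_per_socket * threads_per_core


if __name__ == "__main__":
    # Single-core CPU with HyperThreading: 1 * 1 * 2
    print(logical_cpu_count(1, 1, 2))  # 2
    # Quad-core CPU with HyperThreading: 1 * 4 * 2
    print(logical_cpu_count(1, 4, 2))  # 8
    # Octa-core CPU without HyperThreading: 1 * 8 * 1
    print(logical_cpu_count(1, 8, 1))  # 8
```

The last two cases report the same count of 8, which is why the number shown by htop or nproc alone cannot tell you whether you have 8 real cores or 4 cores with HT.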

1 LCPU = 1 thread

Quad-core CPU with HyperThreading in Windows

Finally, we will often find processors described as having 4 threads, 2 threads per core and things like that. This simply refers to the number of execution threads or processing jobs that can run simultaneously, which is equivalent to the processing capacity offered by an LCPU. If a processor allows 2 threads per core, it means that it has HT. Otherwise the number of threads normally matches the number of cores.
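One way to tell sockets, cores and threads apart on a real Linux box is to group the entries of /proc/cpuinfo by their `physical id` and `core id` fields. The following is a minimal sketch, not a definitive tool. The function name `summarize_cpuinfo` and the `SAMPLE` text are invented for illustration, and it assumes the x86-style fields are present. On architectures that omit them, the fallback treats every logical CPU as its own core.

```python
from collections import defaultdict

# Hypothetical /proc/cpuinfo excerpt: one socket, 2 cores, 2 threads per core.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 0
core id : 0

processor : 3
physical id : 0
core id : 1
"""


def summarize_cpuinfo(text: str) -> dict:
    """Count sockets, cores and logical CPUs in /proc/cpuinfo-style text."""
    cores = defaultdict(list)  # (physical id, core id) -> logical CPU ids
    current = {}               # fields of the stanza being read

    def flush() -> None:
        # Record the finished stanza, if it described a logical CPU.
        if "processor" in current:
            socket = current.get("physical id", "0")
            core = current.get("core id", current["processor"])
            cores[(socket, core)].append(int(current["processor"]))
        current.clear()

    for line in text.splitlines():
        if not line.strip():
            flush()  # a blank line ends one logical CPU's stanza
            continue
        name, _, value = line.partition(":")
        current[name.strip()] = value.strip()
    flush()  # the last stanza may not be followed by a blank line

    logical = sum(len(ids) for ids in cores.values())
    return {
        "sockets": len({socket for socket, _ in cores}),
        "cores": len(cores),
        "logical_cpus": logical,
        # Logical CPUs that share one physical core (HT siblings).
        "siblings": sorted(sorted(ids) for ids in cores.values()),
    }


if __name__ == "__main__":
    print(summarize_cpuinfo(SAMPLE))
```

On a live system you could pass it the real file contents, for example `summarize_cpuinfo(open("/proc/cpuinfo").read())`. When the number of logical CPUs is larger than the number of cores, the `siblings` list shows which LCPUs share a core.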

Logical CPU vs Virtual CPU

The term virtual CPU is comparable to logical CPU, but with a nuance: it belongs to the context of virtualization. It refers to the CPUs mapped to virtual machines from the underlying host hardware, which can be physical or logical CPUs, with HT or not. Normally 1 logical CPU from the host server is mapped to 1 virtual CPU inside the virtual machine, so they are almost equivalent terms.

Recommended reading:
How to know how many cores and processors has a Linux box

About the author

Daniel López Azaña

Tech entrepreneur and cloud architect with over 20 years of experience transforming infrastructures and automating processes.

Specialist in AI/LLM integration, Rust and Python development, and AWS & GCP architecture. Restless mind, idea generator, and passionate about technological innovation and AI.

Comments

River~~ December 11, 2018
In common with several other authors, your info stops just where I need to know one more thing. How do I find out which of the four apparent cores in a two core hyperthreaded system are more closely related to one another? Uses would be that if I have two cpu intensive threads I want them on separate real cores, so would want them to be on two apparent cores that are further from each other. Conversely, if each uses less than 50% cpu time, but the two threads talk to one another a lot, I want them on the same real core, ie on two cores that appear "close" to one another. All that is obvious from your excellent diagrams. But then you stop short of telling me where to look in the /proc filesystem to find that out. Please email me if you add a follow up post to show this rather than a comment (as your system will email me automatically if you answer as a comment). I would tactfully suggest that a follow on post would enhance your site ;)
Jeff Brower December 17, 2018
River, exactly. You phrased it precisely, and I've been searching for days with no luck. Have you found a stackoverflow or other page that gives the answer ? If so pls e-mail me at jbrower at signalogic dot com, thanks.
Daniel January 8, 2019
Hi River, thank you very much for your feedback. This is a good question that was out of the scope of this post, but it may be a good idea to write another one to expand on it. You can identify which physical CPU or core corresponds to each logical CPU from the output of /proc/cpuinfo. You can parse it like this to get the information you need:
egrep "(( id|processo).*:|^ *$)" /proc/cpuinfo
You can also use this command as an alternative:
cat /proc/cpuinfo | egrep "processor|physical id|core id" | sed 's/^processor/\nprocessor/g'
Both will give you this output:
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 0
core id : 2

processor : 3
physical id : 0
core id : 3

My example CPU does not have HT, so each processor has a different core id, but if yours has HT you will find that some processors share the same core id. Once you have identified your pairs of processors, you can force your processes to run on specific cores with the taskset command: http://xmodulo.com/run-program-process-specific-cpu-cores-linux.html Hope that helps!
Saurabh Singh December 15, 2018
Thank you for this informative article!
Jeff Brower January 9, 2019
Daniel- Daniel thanks for answering River's question. Can you show how to identify processor pairs in htop ? I have run experiments using an HP DL380 with 32 cores (2 CPUs, eight 2x hyperthreaded cores each) doing this: -some number N of highly computationally intensive threads -some number M of "feeder" threads that feed the computation threads Typically I use N between 2 and 8 and M between 20 and 50. From this I can see in htop which cores are physical and logical because if I pin the computation threads to use only physical cores (as specified in /proc/cpuinfo) and then pin the feeder threads to use any cores *except both* the physical and logical cores of the computation threads, that gives by far the best performance. It seems to make sense as the computation threads would not benefit from hyperthreading and in fact any context switching at all for those threads would impact performance. Knowing what are the processor pairs via /proc/cpuinfo is good, being able to visually confirm them in htop would be great. Thanks
Daniel January 9, 2019
You can configure htop to display the processor id in which each process is running. Run htop, press F2, select Columns from Setup list, choose PROCESSOR from Available Columns list, press F5 to add it to Active Columns and F10 to finish. Now you can see a new column with the processor assigned to each process.
Jeff Brower January 9, 2019
Daniel, thanks very much for your fast reply. Yes I know about the Processor column, and the threads also say which processor (using sched_getcpu()). What I'm looking for is a visually intuitive way in htop, when viewing the processor (CPU) task usage bars at the top, to know which bars are physical and which logical. My understanding is that it varies between machine architectures, for example 0-7 might be physical and 8-15 their logical siblings on one 32 core machine, and 0-7 and 16-23 might be paired on another 32-core machine. For us to interpret our user's /proc/cpuinfo file and then translate that to htop is somewhat painstaking.
Daniel January 9, 2019
I'm not sure you can do what you're trying to do. You mean the "Meters" section of htop, right? I think that section only lets you choose from a small group of display options, but you can't customize the color of each CPU or their appearance as a function of other variables. I'm afraid you'll have to contact the htop developers and ask them to add it as a new feature. In fact it's an interesting feature to request. I know it is not what you are looking for, but maybe it helps: if you want to display your system CPU topology in a graphical and more intuitive way you can use the lstopo command from hwloc utils: https://www.open-mpi.org/projects/hwloc/doc/v2.0.3/a00312.php#cli_examples
Jeff Brower January 9, 2019
Daniel, yes the "Meters" section -- I didn't know that's what it's called. Ok I will ask htop guys. Thanks again.
Carol August 23, 2019
Thank you for your clear explanation and diagrams. I would love to subscribe to your blog but, unless I am mistaken, there does not currently seem to be a way to do so.
Daniel January 20, 2020
Thank you Carol. I don’t have a newsletter at the moment, if that’s what you mean, but you can subscribe to the RSS feed, which is available.
sarvagya bhardwaj October 10, 2019
1 PCPU = 2 LCPU
1 LCPU = 25 VCPU (theoretically)
1 LCPU = 4-12 VCPU (practically)
~ 1 LCPU = 4 VCPU (minimum)
1 PCPU = 8 VCPU
32-core PCPU server can give 32*4 = 128 VCPU
