
Simple and Linear Fast Adder

In a series of articles and lectures, I have proposed a "Simple and Linear Fast Adder" architecture for Arithmetic Logic Units (ALUs). The Von Neumann bottleneck, responsible for much of a processor's delay and power consumption, is avoided by using a Compute-In-Memory architecture. This advance enables faster, more energy-efficient processing, which is crucial for training neural networks and artificial intelligence (AI), cryptography, scientific modeling, mathematical research, digital image processing, and other demanding applications that require high-performance ASICs, GPUs, and TPUs.

Traditional fast adders grow in complexity and area proportionally to the square of the number of bits. The proposed adder, by contrast, has constant circuit complexity, reduced gate depth, and scales linearly. Beyond better time and energy efficiency, its design, production, and material costs are also lower. This Simple and Linear Fast Adder offers an even greater advantage over other architectures because it makes it possible to implement a Compute-In-Memory architecture for adding multiple inputs and multiplying two inputs. One of the circuit's most important features is that it can be scaled to achieve In-Situ Fast Matrix Multiplication (Compute-In-Memory).
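The proposed adder's internals are described in the linked articles and patentability report; as context for the complexity comparison above, here is a minimal sketch of a conventional ripple-carry adder, whose area grows linearly with the bit width but whose carry must propagate serially through every bit. This is a generic textbook illustration, not the proposed design:

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out

def ripple_carry_add(x, y, n_bits):
    """Add two n-bit integers bit by bit, propagating the carry serially.
    Area (one full adder per bit) scales linearly in n_bits, but so does
    the worst-case carry chain -- the delay classic fast adders attack by
    spending extra area on carry-lookahead logic."""
    carry = 0
    result = 0
    for i in range(n_bits):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s, carry = full_adder(a, b, carry)
        result |= s << i
    # The carry out of the top bit is discarded (addition modulo 2**n_bits).
    return result

print(ripple_carry_add(13, 9, 8))  # 22
```

Flat carry-lookahead designs shorten that carry chain at the price of gate counts that grow much faster than linearly, which is the trade-off the proposed architecture claims to avoid.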


The Importance of Matrix Multiplication

Matrix multiplication is a fundamental operation in mathematics, computer science, and engineering, making it possible to model and compute complex relationships between data sets. Its efficiency directly affects the performance of many cutting-edge applications, including:

  1. Artificial Intelligence (AI) and Machine Learning (ML)
    Matrix multiplication drives essential operations in neural networks, such as applying weights to inputs during forward and backward propagation. It directly affects training speed and the scalability of AI models, which are fundamental to applications such as natural language processing, computer vision, and recommendation systems.

  2. Computer Graphics and Video Games
    Transformations such as rotation, scaling, and translation in 3D graphics rely on matrix operations. Matrix multiplication enables real-time rendering for video games, simulations, and virtual- and augmented-reality environments.

  3. Cryptography and Security
    Matrix multiplication underlies many cryptographic algorithms used in secure key exchange, encryption, and decryption. Speeding up these operations improves the efficiency of protecting sensitive data, especially in real-time applications.

  4. Scientific Computing and Simulations
    In fields such as physics, chemistry, and weather modeling, matrix operations are crucial for solving large-scale simulations and numerical methods. Faster matrix multiplication allows more complex and accurate models to be processed in less time.

  5. Data Analysis and Big Data
    Techniques such as principal component analysis (PCA) and machine learning models use matrix multiplication to analyze correlations and patterns in massive data sets, yielding key insights in sectors such as finance, healthcare, and marketing.

  6. Signal Processing
    Digital signal processing for audio, image, and video data relies on matrix multiplication for tasks such as filtering, transforms, and compression. The operation is integral to technologies such as MP3 encoding, video compression, and medical imaging.

  7. Optimization Problems
    From logistics to robotics, many optimization techniques involve solving equations that depend on matrix operations. Fast, efficient matrix multiplication speeds up decision-making and problem-solving in real-time systems.

 

The computational cost of matrix multiplication grows rapidly with matrix size. As the demand for computing power increases, especially in fields such as AI, Big Data, and cryptography, hardware advances in matrix multiplication will be essential to drive innovation and meet future challenges. Innovations such as the Fast Arithmetic Unit (FAU), which incorporates matrix multiplication optimized at the hardware level, are critical.
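To make that cost growth concrete, here is a minimal, textbook sketch of naive matrix multiplication. The triple loop performs one multiply-add per (i, k, j) triple, so for square n×n matrices the work grows as n³: doubling n multiplies the operation count by eight. This is a generic illustration of the inner loop such hardware targets, not the FAU's own algorithm:

```python
def matmul(A, B):
    """Multiply matrices given as lists of rows; also count multiply-adds."""
    n, m, p = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "inner dimensions must match"
    C = [[0] * p for _ in range(n)]
    ops = 0
    for i in range(n):
        for k in range(m):
            for j in range(p):
                C[i][j] += A[i][k] * B[k][j]  # one scalar multiply-add
                ops += 1
    return C, ops

C, ops = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(C, ops)  # [[19, 22], [43, 50]] 8
```

For n = 2 the loop performs 8 multiply-adds; for n = 1024 it would perform over a billion, which is why dedicated matrix hardware matters.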

In-Situ Architecture: New Frontiers for Computing

The traditional Von Neumann architecture separates memory from processing, requiring data to move between the two. This creates a significant bottleneck, especially in tasks with massive operation workloads such as matrix multiplication. A Compute-In-Memory (CIM) architecture eliminates this bottleneck by performing computations directly inside memory, offering several transformative benefits:

  1. Reduced Latency: By minimizing data transfers between memory and the processor, CIM significantly speeds up matrix operations.

  2. Energy Efficiency: Computations performed inside memory reduce power consumption, making CIM ideal for applications that require sustained performance, such as data centers and AI training.

  3. Scalability: The architecture supports parallel processing of matrix operations, which is crucial for high-performance computing tasks.

  4. Compact Design: CIM reduces hardware size, allowing integration into smaller devices, from mobile phones to edge-computing nodes.
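As a rough illustration of why keeping computation inside memory helps, the following sketch counts the words moved across the memory-processor boundary for a naive n×n matrix multiplication under two deliberately simplified models. The counting rules are illustrative assumptions for this comparison, not measurements of any real CIM device:

```python
def von_neumann_words(n):
    """Words crossing the memory bus for naive n x n matmul on a
    conventional ALU: each of the n**3 multiply-adds reads two operands
    and the running accumulator, then writes the accumulator back.
    Worst-case model with no caching assumed."""
    return 4 * n ** 3

def cim_words(n):
    """Simplified compute-in-memory model: operands never leave the
    memory array, so only the n**2 result words are moved out."""
    return n ** 2

for n in (8, 64):
    # The ratio grows as 4*n, so the bigger the matrices, the more the
    # Von Neumann data movement dominates.
    print(n, von_neumann_words(n) // cim_words(n))
```

Even under a more generous cached model for the conventional side, the qualitative conclusion is the same: data movement, not arithmetic, dominates as matrices grow, which is the bottleneck CIM removes.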

 

When combined with optimized hardware such as the Fast Arithmetic Unit (FAU), the Compute-In-Memory architecture amplifies the impact of matrix multiplication by delivering unmatched speed and computational efficiency. This synergy is particularly vital for meeting the growing demands of AI, Big Data, and real-time systems.

  1. Proposal

  2. Patentability Report

  3. Articles

  4. Lectures

  5. Additional Links
How do Graphics Cards Work? Exploring GPU Architecture

How Computers Calculate - the ALU: Crash Course Computer Science #5

Architecture All Access: Modern CPU Architecture Part 1 – Key Concepts | Intel Technology

Architecture All Access: Modern CPU Architecture 2 - Microarchitecture Deep Dive | Intel Technology

How a CPU Works

How Amateurs created the world’s most popular Processor (History of ARM Part 1)