An Introduction to HBM Memory

Original post: http://www.anandtech.com/show/9969/jedec-publishes-hbm2-specification

The high-bandwidth memory (HBM) technology solves two key problems related to modern DRAM: it substantially increases bandwidth available to computing devices (e.g., GPUs) and reduces power consumption. The first-generation HBM has a number of limitations when it comes to capacity and clock-rates. However, the second-gen HBM promises to eliminate them.

JEDEC, a major semiconductor engineering trade organization that sets standards for DRAM, recently published the final specifications of the second-generation HBM (HBM2), which means that members of the organization have ratified the standard. The new memory technology builds upon the foundation of the original JESD235 standard, which describes stacked memory devices interconnected using through silicon vias (TSVs) with a very wide input/output (I/O) interface operating at moderate data-rates. The JESD235A standard will help engineers further increase the performance, capacity and capabilities of HBM memory chips. HBM Gen 2 will be particularly useful for the upcoming video cards from AMD and NVIDIA, which, thanks to HBM2, can feature as much as 512 GB/s – 1 TB/s of memory bandwidth and 8, 16 or even 32 GB of memory onboard.

HBM Gen 1: Good, But With Limitations

The original JESD235 standard defines the first-generation HBM (HBM1) memory chips with a 1024-bit interface and up to 1 Gb/s data-rate, which stack two, four or eight DRAM devices with two 128-bit channels per device on a base logic die. Each HBM stack (which is also called KGSD — known good stacked die) supports up to eight 128-bit channels because its physical interface is limited to 1024 bits. Every channel is essentially a 128-bit DDR interface with 2n prefetch architecture (256 bits per memory read and write access) that has its own DRAM banks (8 or 16 banks, depending on density), command and data interface, clock-rate, timings, etc. Each channel can work independently of the other channels in the stack or even within one DRAM die. HBM stacks use passive silicon interposers to connect to host processors (e.g., GPUs). For more information about HBM check out our article called “AMD Dives Deep On High Bandwidth Memory — What Will HBM Bring AMD?”.
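As a quick sanity check of how these figures fit together, here is a minimal sketch (plain Python; the constant and variable names are ours and purely illustrative) that derives the per-stack channel count, access granularity and peak bandwidth from the 1024-bit interface, 1 Gb/s per-pin rate and 2n prefetch quoted above.

```python
# Illustrative HBM1 arithmetic based on the JESD235 figures quoted in the text.
INTERFACE_WIDTH_BITS = 1024     # total physical interface per stack
CHANNEL_WIDTH_BITS = 128        # each channel is a 128-bit DDR interface
DATA_RATE_GBPS = 1.0            # up to 1 Gb/s per pin for HBM1
PREFETCH = 2                    # 2n prefetch

channels_per_stack = INTERFACE_WIDTH_BITS // CHANNEL_WIDTH_BITS  # 8 channels
bits_per_access = CHANNEL_WIDTH_BITS * PREFETCH                  # 256 bits per read/write access
peak_bw_gb_s = INTERFACE_WIDTH_BITS * DATA_RATE_GBPS / 8         # 128.0 GB/s per stack

print(channels_per_stack, bits_per_access, peak_bw_gb_s)          # 8 256 128.0
```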

HBM Gen 1 memory KGSDs produced by SK Hynix (the only company that makes them commercially) stack four 2 Gb memory dies and operate at a 1 Gb/s data rate per pin. AMD uses these KGSDs with 1 GB capacity and 128 GB/s peak bandwidth per stack to build its Fiji GPU system-in-packages (SiPs) and the Radeon R9 Fury/R9 Nano video cards. The graphics adapters have 4 GB of VRAM onboard, not a lot for 2016. While AMD’s flagship video cards do not seem to have capacity issues right now, 4 GB of memory per graphics adapter is a limitation. AMD’s latest graphics cards sport 512 GB/s of memory bandwidth, a massive amount by today’s standards, but even that amount could be a constraint for future high-end GPUs.
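The Fiji-class totals follow directly from those per-stack figures; a back-of-the-envelope sketch (values taken from the paragraph above, names are ours):

```python
# Capacity and bandwidth of a four-stack HBM1 SiP as implied by the figures above (illustrative).
dies_per_stack = 4
die_capacity_gbit = 2
stack_capacity_gb = dies_per_stack * die_capacity_gbit / 8   # 1.0 GB per HBM1 stack
stack_bandwidth_gb_s = 128                                   # peak per stack, from the sketch above

stacks_on_sip = 4
print(stacks_on_sip * stack_capacity_gb)      # 4.0 GB of VRAM on the Radeon R9 Fury/Nano
print(stacks_on_sip * stack_bandwidth_gb_s)   # 512 GB/s aggregate peak bandwidth
```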

HBM Gen 2: Good Thing Gets Better

The second-generation HBM (HBM2) technology, which is outlined by the JESD235A standard, inherits the physical 128-bit DDR interface with 2n prefetch architecture, the internal organization, the 1024-bit input/output width and the 1.2 V I/O and core voltages, as well as all the other crucial parts of the original technology. Just like its predecessor, HBM2 supports two, four or eight DRAM devices on a base logic die (2Hi, 4Hi, 8Hi stacks) per KGSD. HBM Gen 2 expands the capacity of DRAM devices within a stack to 8 Gb and increases supported data-rates to 1.6 Gb/s or even 2 Gb/s per pin. In addition, the new technology brings an important improvement intended to maximize actual bandwidth.
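Plugging the HBM2 ceilings into the same arithmetic gives the headline per-stack numbers (again an illustrative sketch using the figures above, not text from the spec):

```python
# Per-stack HBM2 limits implied by the JESD235A figures quoted above (illustrative).
INTERFACE_WIDTH_BITS = 1024
MAX_DIES_PER_STACK = 8          # 8Hi stack
MAX_DIE_CAPACITY_GBIT = 8
MAX_DATA_RATE_GBPS = 2.0        # up to 2 Gb/s per pin

max_capacity_gb = MAX_DIES_PER_STACK * MAX_DIE_CAPACITY_GBIT / 8   # 8.0 GB per stack
max_peak_bw_gb_s = INTERFACE_WIDTH_BITS * MAX_DATA_RATE_GBPS / 8   # 256.0 GB/s per stack

print(max_capacity_gb, max_peak_bw_gb_s)   # 8.0 256.0
```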

One of the key enhancements of HBM2 is its Pseudo Channel mode, which divides a channel into two individual sub-channels of 64-bit I/O each, providing a 128-bit prefetch per memory read and write access for each one. Pseudo channels operate at the same clock-rate and share the row and column command bus as well as the CK and CKE inputs; however, they have separate banks and decode and execute commands individually. SK Hynix says that the Pseudo Channel mode optimizes memory accesses and lowers latency, which results in higher effective bandwidth.
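To make the channel split concrete, here is a small sketch of the access granularity in Legacy versus Pseudo Channel mode (the mode itself is a hardware configuration; this is only illustrative arithmetic based on the description above):

```python
# Legacy channel vs. Pseudo Channel granularity, per the description above (illustrative).
CHANNEL_WIDTH_BITS = 128
PREFETCH = 2

legacy_access_bits = CHANNEL_WIDTH_BITS * PREFETCH         # 256 bits per read/write access
pseudo_channel_width = CHANNEL_WIDTH_BITS // 2             # two 64-bit sub-channels per channel
pseudo_access_bits = pseudo_channel_width * PREFETCH       # 128 bits per read/write access

legacy_channels_per_stack = 1024 // CHANNEL_WIDTH_BITS     # 8 channels
pseudo_channels_per_stack = legacy_channels_per_stack * 2  # 16 independently decoding pseudo channels
print(legacy_access_bits, pseudo_access_bits, pseudo_channels_per_stack)   # 256 128 16
```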

If, for some reason, an ASIC developer believes that Pseudo Channel mode is not optimal for their product, then HBM2 chips can also work in Legacy mode. While memory makers expect HBM2 to deliver higher effective bandwidth than its predecessor, how efficient next-generation memory sub-systems actually turn out to be will depend on the developers of memory controllers. In any case, we will need to test actual hardware before we can confirm that HBM2 is better than HBM1 at the same clock-rate.

Additional improvements of HBM2 over the first-gen HBM include lane remapping modes for hard and soft repair of lanes (HBM1 supports various DRAM cell test and repair techniques to improve yields of stacks, but not lane remapping), anti-overheating protection (a KGSD can alert memory controllers of unsafe temperatures) and several others.

The second-generation HBM memory will be produced using newer manufacturing technologies than the first-gen HBM. For example, SK Hynix uses its 29nm process to make DRAM dies for its HBM1 stacks; for HBM2 memory, the company intends to use its 21nm process. Thanks to newer manufacturing technologies and higher effective bandwidth, HBM2 should offer higher energy efficiency than HBM1 even at its higher data-rates, but we do not have exact details at this point. In any case, HBM2 is likely to be more energy efficient than GDDR5 and GDDR5X, hence the odds are good that it will be the memory of choice for high-end graphics cards in the future.

Samsung Electronics this week said that it had begun mass production of HBM2 memory, but did not reveal too many details. Samsung's HBM2 KGSD features 4 GB capacity, a 2 Gb/s data rate per pin and is based on four 8 Gb DRAM dies. The memory chips will let device manufacturers build SiPs with up to 16 GB of memory. It is noteworthy that Samsung decided to use 8 Gb DRAM dies for its HBM2 stacks. Such a decision looks quite logical, since with 8 Gb DRAM ICs the company can relatively easily increase or decrease the capacity of its KGSDs by altering the number of DRAM layers. The DRAM maker uses its 20nm process to produce its HBM2 DRAM KGSDs. Unfortunately, Samsung did not reveal the actual power consumption of the new memory stacks.
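Samsung's announced configuration maps onto the same arithmetic; an illustrative check of the stated numbers (names are ours):

```python
# Samsung's announced HBM2 KGSD, spelled out (illustrative arithmetic only).
dies_per_stack = 4
die_capacity_gbit = 8
data_rate_gbps = 2.0

stack_capacity_gb = dies_per_stack * die_capacity_gbit / 8    # 4.0 GB per KGSD
stack_peak_bw_gb_s = 1024 * data_rate_gbps / 8                # 256.0 GB/s per KGSD

print(stack_capacity_gb, stack_peak_bw_gb_s)   # 4.0 256.0
print(4 * stack_capacity_gb)                   # 16.0 GB with four such stacks on one SiP
```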


HBM2 memory stacks are not only faster and more capacious than HBM1 KGSDs, but they are also larger. SK Hynix’s HBM1 package has dimensions of 5.48 mm × 7.29 mm (39.94 mm²). The company’s HBM2 chip will have dimensions of 7.75 mm × 11.87 mm (91.99 mm²). Besides, HBM2 stacks will also be higher (0.695 mm/0.72 mm/0.745 mm vs. 0.49 mm) than HBM1 KGSDs, which may require developers of ASICs (e.g., GPUs) to install a heat-spreader on their SiPs to compensate for any differences in height between the memory stacks and the GPU die, to protect the DRAM, and to guarantee sufficient cooling for the high bandwidth memory.

The larger footprint of second-gen HBM2 means that the upcoming SiPs with multiple memory stacks will require larger silicon interposers, which means that they are going to be slightly more expensive than SiPs based on the first-gen HBM. Since the geometric parameters of the staggered microbump pattern of HBM1 and HBM2 are the same, the complexity of the passive silicon interposers will remain the same for both types of memory. The good news is that only two HBM2 stacks are needed to enable 512 GB/s of bandwidth, which implies that from a bandwidth-per-mm² point of view the new memory tech continues to be very efficient.
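To put the footprint claim in rough numbers (the package sizes and per-stack peak rates come from the text above; the arithmetic below is only an illustration, since real designs depend on achievable clocks, controllers and routing): two HBM2 stacks at 2 Gb/s match the 512 GB/s of four HBM1 stacks, and bandwidth per square millimetre of DRAM footprint stays in the same ballpark.

```python
# Bandwidth per unit of DRAM footprint, using the package sizes and peak rates quoted above
# (illustrative; real achievable rates depend on clocks, controllers and routing).
hbm1_area_mm2 = 5.48 * 7.29       # ~39.9 mm2 per HBM1 KGSD
hbm2_area_mm2 = 7.75 * 11.87      # ~92.0 mm2 per HBM2 KGSD

print(2 * 256)                                 # 512 GB/s from only two HBM2 stacks at 2 Gb/s
print(round(512 / (2 * hbm2_area_mm2), 2))     # ~2.78 GB/s per mm2 for two HBM2 stacks
print(round(512 / (4 * hbm1_area_mm2), 2))     # ~3.2 GB/s per mm2 for four HBM1 stacks at 1 Gb/s
```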


A slide by FormFactor and Teradyne from their presentation at Semiconductor Wafer Test Workshop 2015

Since SK Hynix’s HBM1 KGSDs are smaller than the company’s HBM2 stacks, they are going to have an advantage over the second-gen high-bandwidth memory for small form-factor SiPs. As a result, the South Korea-based DRAM maker may retain production of its HBM1 chips for some time.

New Use Cases and Industry Support

Thanks to higher capacity and data-rates, HBM2 memory stacks will be pretty flexible when it comes to configurations. For example, it will be possible to build a 2 GB KGSD with 256 GB/s of bandwidth that uses only two 8 Gb memory dies. Such a memory stack could be used for graphics adapters designed for notebooks or ultra-small personal computers. Besides, it could be used as an external cache for a hybrid microprocessor with built-in graphics (in the same manner as Intel uses its eDRAM cache to boost performance of its integrated graphics processors). What remains to be seen is the cost of HBM2 stacks that deliver 256 GB/s of bandwidth. If HBM2 and the necessary interposer remain as expensive as HBM1, the technology will likely continue to be used only for premium solutions.
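For reference, the 2 GB / 256 GB/s configuration mentioned above works out as follows, under the assumption (ours, not stated in the source) that such a 2Hi stack still exposes the full 1024-bit interface at 2 Gb/s per pin:

```python
# The compact 2Hi HBM2 configuration mentioned above (illustrative; assumes the 2Hi stack
# still exposes the full 1024-bit interface running at 2 Gb/s per pin).
dies = 2
die_capacity_gbit = 8
print(dies * die_capacity_gbit / 8)   # 2.0 GB per KGSD
print(1024 * 2.0 / 8)                 # 256.0 GB/s peak per KGSD
```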

Thanks to the variety of KGSD configurations prepared by DRAM manufacturers, expect new types of devices to start using HBM2. Samsung and SK Hynix believe that in addition to graphics and HPC (high-performance computing) cards, various server, networking and other applications will utilize the new type of memory. As of September 2015, more than 10 companies were developing system-on-chips (including ASICs, x86 processors, ASSPs and FPGAs) with HBM support, according to SK Hynix.

The first-generation HBM memory delivers great bandwidth and energy efficiency, but it is produced by only one maker of DRAM and is not widely supported by developers of various ASICs. By contrast, Samsung Electronics and SK Hynix, two companies that control well over 50% of the global DRAM output, will make HBM2. Micron Technology has yet to confirm its plans to build HBM2, but since this is an industry-standard type of memory, the door is open if the company wishes to produce it.

Overall, industry support for the high bandwidth memory technology is growing: more than 10 companies are working on SoCs with HBM support, and the leading DRAM makers are gearing up to produce HBM2. The potential of the second-gen HBM seems to be rather high, but costs remain a major concern. Regardless, it will be extremely interesting to see next-generation graphics cards from AMD and NVIDIA featuring HBM2 DRAM and find out what they are capable of thanks to the new Polaris and Pascal architectures as well as the new type of memory.
