fpga

CUDA or FPGA for special purpose 3D graphics computations? [closed]

人盡茶涼 submitted on 2019-12-29 10:08:53
Question (closed as opinion-based): I am developing a product with heavy 3D graphics computations, to a large extent closest-point and range searches. Some hardware optimization would be useful. While I know little about this, my boss (who has no software experience) advocates FPGA (because it can be tailored

GPUs hold the advantage over FPGAs in adapting to AI's rapidly changing demands

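天涯浪子 submitted on 2019-12-29 08:04:15
Feixiang.com (飞象网) news (by 源初): Last week, during GTC19, Paresh Kharya, NVIDIA's Director of Product Management for Accelerated Computing, answered a question about the advantages of GPUs over FPGAs by saying that GPUs have a clear advantage in programmability and a shorter overall development time.   He said that getting an FPGA working currently takes several months of programming, and that it has to be programmed at the hardware level. AI, however, is changing very quickly, with updates sometimes measured in minutes, so highly flexible programmability on the software side is essential. The GPU is precisely a chip specialized for AI: its instruction set is a strong advantage, it is fully programmable, and it is software-defined.   Another advantage of the GPU is its forward-compatible architecture: if new hardware is needed in the future, the development cycle can be shortened considerably, the hardware can keep adapting as the software is updated, and updates can be applied directly in the software libraries. NVIDIA's platform can also be used on any device, including desktops, laptops, servers, data centers, the edge, and IoT.   In the GTC19 keynote, Jensen Huang also repeatedly stressed the clear advantages of GPUs over CPUs across application areas. For example, Alibaba's edge system handles 780 queries per second on a GPU but only 3 queries per second on a CPU; and in Baidu's use of the NVIDIA AI recommender system, training models on Baidu's huge volume of user-interest data used to be costly and slow on CPUs, whereas the cost of training on GPUs is only 1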

FPGA and PCI-E

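本小妞迷上赌 submitted on 2019-12-28 16:16:03
From parallel to serial: PCI Express (also known as PCIe) is a high-performance, high-bandwidth serial interconnect standard that replaces bus-based communication architectures such as PCI, PCI Extended (PCI-X), and the Accelerated Graphics Port (AGP). Main benefits of PCI-e: lower production cost; higher system throughput; better scalability and flexibility. The traditional bus-based interconnects above come nowhere near the performance PCI-e offers. The PCI Express standard was designed with the future in mind and continues to evolve to give systems higher throughput. First-generation PCIe specifies a line rate of 2.5 gigabits per second (Gbps) per lane, the second generation reaches 5.0 Gbps, and the recently released PCIe 3.0 standard supports 8.0 Gbps. While the PCIe standard keeps adopting the latest technology to deliver increasing throughput, its layered protocol keeps drivers software-compatible with existing PCI applications, which simplifies the transition from PCI to PCIe. Although PCIe was initially aimed at computer expansion cards and graphics cards, it is now used in a much wider range of fields, including networking, communications, storage, industrial, and consumer electronics. This post does not cover the PCI-e protocol in detail; it only gives an overview of PCI-e, the advantages of PCI-e, and the advantages of implementing PCI-e on an FPGA. The advantages of PCIe come at the cost of complexity. PCIe is a packet-based serial connection protocol, estimated to be more than 10 times as complex as the parallel PCI bus. This complexity stems partly from the parallel-to-serial data conversion required at gigahertz rates and from the move to a packet-based implementation. PCI and PCI-e interfaces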
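
The per-generation line rates quoted above are raw rates; the usable rate is lower once line encoding is accounted for. The sketch below works through that arithmetic. The encoding schemes (8b/10b for the first two generations, 128b/130b for PCIe 3.0) are not stated in the excerpt and are added here as background; the function and table names are illustrative only, and packet and protocol overhead are ignored.

```python
# Per-lane, per-direction figures. Line rates come from the excerpt above;
# the encoding schemes are background assumptions, not part of the excerpt.
GENERATIONS = {
    "PCIe 1.x": (2.5, 8, 10),     # 8b/10b: 8 payload bits per 10 line bits
    "PCIe 2.0": (5.0, 8, 10),     # 8b/10b
    "PCIe 3.0": (8.0, 128, 130),  # 128b/130b
}


def effective_gbps(line_rate_gbps, payload_bits, coded_bits, lanes=1):
    """Usable bit rate after line encoding, before packet/protocol overhead."""
    return line_rate_gbps * payload_bits / coded_bits * lanes


if __name__ == "__main__":
    for name, (rate, payload, coded) in GENERATIONS.items():
        x1 = effective_gbps(rate, payload, coded)
        x8 = effective_gbps(rate, payload, coded, lanes=8)
        print(f"{name}: x1 = {x1:.2f} Gbps, x8 = {x8:.2f} Gbps")
```

Under these assumptions a first-generation lane carries 2.0 Gbps of encoded payload, and the change of encoding in PCIe 3.0 is why its usable rate nearly doubles that of PCIe 2.0 even though the line rate rises only from 5.0 to 8.0 Gbps.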

DIP switches and expansion I/O on the Tronlong Xilinx Artix-7 series FPGA development board

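心已入冬 submitted on 2019-12-27 17:48:12
The TLA7-EasyEVM is a development board designed in-house by Guangzhou Tronlong around the Xilinx Artix-7 series FPGA. It uses a core board plus baseboard arrangement and allows quick evaluation of FPGA performance. The core board measures only 70 mm x 50 mm and is a 10-layer design built with a lead-free immersion-gold process; professional PCB layout ensures signal integrity, and strict quality control makes the board suitable for industrial environments. The SOM-TLA7 core board brings out all of the CPU's resource signal pins, which makes secondary development very easy: customers only need to focus on the upper-layer application, greatly reducing development difficulty and time cost so that products reach the market quickly. Besides a rich set of demo programs, Tronlong provides detailed development tutorials and comprehensive technical support to assist customers with baseboard design, debugging, and software development. DIP switch: SW1 is a 2-bit boot-select switch. With the board placed in the orientation shown in the figure below, a switch pushed toward ON reads 1, otherwise 0. The hardware and pin definitions are shown in the figure below. Expansion I/O: the board brings out two 48-pin Euro-style connectors (CON10 and CON11) carrying GPIO and other expansion signals, and one 50-pin box header (J1) carrying GPIO and other expansion signals. The hardware and pin definitions are shown in the figure below. Source: oschina Link: https://my.oschina.net/u/4169033/blog/3148284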

How can I generate a pulse train to give output in a common way?

落花浮王杯 submitted on 2019-12-25 18:44:41
Question: I am working on generating a pulse train 40 bits long. I must also be able to adjust the frequency. I tried making a new low-frequency clock and a new counter that counts on its rising edges, gives a high output, and terminates after 40 bits. It's not working. I tried some other methods; they did not work either. For example: library IEEE; use IEEE.STD_LOGIC_1164.ALL; use IEEE.NUMERIC_STD.all; entity con40 is port(clk:in std_ulogic; q:out std_ulogic); end entity con40; architecture

How to use the Xilinx Division IP Core

回眸只為那壹抹淺笑 submitted on 2019-12-25 18:32:42
Question: I am writing code in VHDL to be synthesised onto a Xilinx FPGA. I typically use GHDL to simulate my testbenches. I need to use the Xilinx division core in order to divide by a variable, but I am not sure how to do this, as there appear to be no examples in the Xilinx documentation. Do I have to use the Xilinx software to generate the VHDL component for the divider? Or does Xilinx implicitly understand that divider means using the IP core? If my 2nd statement is true how would I go

Weird VHDL Behavior

独自空忆成欢 submitted on 2019-12-25 07:30:47
Question: In the following VHDL code, when I use a logical or, the code stops working and the HD44780 LCD crashes; when I remove the logical or and one of the holders, the code starts to work again. I'm using the Xilinx Spartan 3E starter board. In other words, when I replace SendCommand <= Holder(0); with SendCommand <= Holder(0) or Holder(1); the program acts weird and crashes. Here is the code: library IEEE; use IEEE.STD_LOGIC_1164.all; use IEEE.STD_LOGIC_ARITH.all; use IEEE.STD_LOGIC_UNSIGNED.all;

Implementing arbitrary odd-number frequency division on an FPGA

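天大地大妈咪最大 submitted on 2019-12-25 06:30:22
Principle of odd division: As we all know, even division on an FPGA is simple: a counter counts up to half the division factor N minus 1, and the divided signal is toggled each time, which yields the divided clock. Odd division works much the same way, but obtaining a signal with a 50% duty cycle takes some extra handling. The figure below shows how a divide-by-3 signal with a 50% duty cycle is generated. Verilog implementation: module div_clk#( parameter DIV_MAX = 3 ) ( input clk, input rst_n, output clk_div ); reg [7:0] cnt_pos; reg clk_div_p; reg [7:0] cnt_neg; reg clk_div_n; /* Use the rising edge of the original signal to generate a divide-by-3 signal; at this point the duty cycle is not 50% */ always @(posedge clk or negedge rst_n) begin if(!rst_n) begin cnt_pos <= 8'd0; clk_div_p <= 1'd0; end else if(cnt_pos == DIV_MAX/2 - 1) begin cnt_pos <= cnt_pos + 1'b1; clk_div_p <= 1'b0; end else if(cnt_pos == DIV_MAX -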
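
The excerpt's Verilog is cut off before the two edge-triggered signals are combined, so the following is only a behavioral sketch of one common way to reach a 50% duty cycle, not the post's own code. It assumes that one counter runs on rising edges, an identical counter runs on falling edges, each drives a signal that is high for (N - 1) / 2 input cycles out of N, and the output is the OR of the two. The names `cnt_pos`, `cnt_neg`, `clk_div_p`, and `clk_div_n` mirror the excerpt; `odd_divider` is a hypothetical helper, and register and reset timing are not modeled.

```python
def odd_divider(n, cycles):
    """Behavioral model of an odd divide-by-n clock with 50% duty cycle.

    Returns the divided clock sampled once per half period of the input
    clock (even samples follow a rising edge, odd samples a falling edge).
    """
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd integer >= 3")
    high = (n - 1) // 2          # input cycles each partial signal stays high
    cnt_pos = cnt_neg = 0
    clk_div_p = clk_div_n = 0
    samples = []
    for half in range(2 * cycles):
        if half % 2 == 0:        # rising edge of the input clock
            clk_div_p = 1 if cnt_pos < high else 0
            cnt_pos = (cnt_pos + 1) % n
        else:                    # falling edge of the input clock
            clk_div_n = 1 if cnt_neg < high else 0
            cnt_neg = (cnt_neg + 1) % n
        # The falling-edge copy lags by half a cycle; OR-ing the two
        # stretches the high time from (n - 1) / 2 to n / 2 input cycles.
        samples.append(clk_div_p | clk_div_n)
    return samples


if __name__ == "__main__":
    print(odd_divider(3, 6))
```

For N = 3 each partial signal is high for one input cycle out of three; the half-cycle offset between them makes the OR high for 1.5 cycles out of 3, which is the 50% duty cycle the excerpt is after.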

Too many comps of type “BUFGMUX” found to fit this device. (Ethernet Design)

不羁的心 submitted on 2019-12-25 04:54:28
Question: I'm designing an Ethernet MAC controller for a Spartan 3E FPGA. IOB utilization has reached 109%. I still proceeded with bitstream generation and then encountered this error: Too many comps of type "BUFGMUX" found to fit this device. What does this mean? (I'm pretty sure the Spartan 3E can run Ethernet, since there is already an Ethernet Lite MAC IP for the Spartan 3E. Also, it has more pins than my module uses. Why, then, is IOB utilization at 109%?) I also tried commenting the

finding power of a number

…衆ロ難τιáo~ submitted on 2019-12-25 04:24:56
Question: I have a very big number that is a product of several small primes. I know the number and I also know the prime factors, but I don't know their powers. For example: (2^a)x(3^b)x(5^c)x(7^d)x(11^e)x .. = 2310 Now I want to recover the exponents in a very fast and efficient manner. I want to implement it in an FPGA. Regards, Answer 1: The issue is that you are doing a linear search for the right power when you should be doing a binary search. Below is an example showing how to the case where the
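
The answer's own example is cut off, so the following is not its code; it is a minimal software sketch of the binary-search idea it names, under the assumption that "binary search for the right power" means testing powers p, p^2, p^4, ... first and then narrowing down. The function names `exponent_of` and `factor_exponents` are hypothetical. The sketch uses arbitrary-precision division, which an FPGA implementation would have to replace with something cheaper.

```python
def exponent_of(n, p):
    """Largest e such that p**e divides n (n > 0, p > 1)."""
    if n % p != 0:
        return 0
    # Doubling phase: squares[k] == p**(2**k), extended while it divides n.
    squares = [p]
    while n % (squares[-1] ** 2) == 0:
        squares.append(squares[-1] ** 2)
    # Descent phase: take each power-of-two step that still divides what
    # remains, from largest to smallest; this spells out e bit by bit.
    e = 0
    for k in range(len(squares) - 1, -1, -1):
        if n % squares[k] == 0:
            n //= squares[k]
            e += 1 << k
    return e


def factor_exponents(n, primes):
    """Map each known prime factor to its exponent in n."""
    return {p: exponent_of(n, p) for p in primes}


if __name__ == "__main__":
    print(factor_exponents(2310, [2, 3, 5, 7, 11]))
```

Each prime needs a number of divisibility tests proportional to the logarithm of its exponent, rather than one test per unit of exponent as in a linear search.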