Tag Archives: Architecture

Some thoughts on Dell's 13th-generation servers

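Dell recently launched its 13G server line. The flagship model, the R730xd, comes with many features that lay a good foundation for it to become a mainstream DB server and a building block for large-scale storage clusters.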

For details, see: http://www.storagereview.com/dell_poweredge_13g_r730xd_review
http://www.storagereview.com/dell_poweredge_gen13_servers_released

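The specific enhancements are:

1. CPUs based on Intel's latest Haswell architecture, which reduces power consumption.
2. More drive slots: expandable to hold up to 18 1.8-inch SSDs, with support for mixing several drive types.
3. DDR4 memory with higher clock speeds.
4. A smarter, iDRAC-based provisioning process.
5. Expanded 10GbE network adapter options.
6. Automated management built on iDRAC8, including server performance monitoring and email alerts (app side).
7. SanDisk caching technology replaces the earlier LSI one (is this related to Seagate's acquisition of LSI's flash business?).
8. An enhanced new-generation RAID controller with a larger cache and direct system log collection on the card (still battery-backed).
9. Use of NFC (automatic scanning of BIOS information and so on).
10. NVMe protocol support (NVMe SSDs; fully embracing Intel?).

And so on.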

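According to Dell's sales team, the R730xd is the next-generation DB server, Hadoop server, and cloud computing server. I have reservations about the Hadoop claim. The 18 SSD slots do increase total SSD capacity, but whether SSDs can deliver their full performance for applications like Hadoop, or under Hadoop's current software architecture, is a question that Facebook's tests have already answered.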

http://hadoopblog.blogspot.com/2012/05/hadoop-and-solid-state-drives.html

Also, a SSD device can support 100K to 200K operations/sec while a spinning disk controller can possibly 
issue only 200 to 300 ops/sec. This means that random reads/writes are not a bottleneck on SSDs. 
On the other hand, most of our existing database technology is designed to store data in spinning disks, 
so the natural question is "can these databases harness the full potential of the SSDs"?

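Taking the two charts together, let's look at the conclusion:

[Image: HdfsPreadImageCache4G]

The conclusion is that Hadoop/HBase currently cannot fully exploit the performance advantages of SSDs. Even after the code changes, a bottleneck remains in Hadoop (the Java DFSClient), and thread locking in HBase leads to low CPU utilization. This stems from traditional databases being designed around mechanical-disk I/O. Oracle handles this very well, though (on Unix/Linux, Oracle is a process-based database).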

Finally, as Dhruba Borthakur put it:

@Sujoy: you are absolutely right. In fact, we currently run multiple servers instances per SSD 
just to be able to utilize all the IOPs. This is kindof-a-poor man's solution to the problem. 
Also, you have to have enough CPU power on the server to be able to drive multiple database 
instances on the same machine.

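Facebook runs multiple instances per server to get the most hardware performance at the lowest cost. This resembles early MySQL: its multithreaded architecture could not fully use the CPUs on SMP/NUMA machines, which gave rise to running one instance per NUMA node and to various CPU-binding strategies. So fitting a traditional database architecture to the latest hardware is not an easy task.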
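The NUMA multi-instance idea can be sketched as follows. This is only an illustration, not the setup used by Facebook or described in the post: it assumes one mysqld instance per NUMA node, bound with numactl's --cpunodebind and --membind options, and the port numbering and option-file paths (/etc/mysql/my3306.cnf and so on) are hypothetical.

```python
def numa_instance_commands(num_nodes, base_port=3306, defaults_dir="/etc/mysql"):
    """Build one launch command per NUMA node.

    Each mysqld instance is pinned to the CPUs and memory of a single node,
    so it never pays for cross-node memory access.
    The option-file naming scheme (my<port>.cnf) is an assumption.
    """
    commands = []
    for node in range(num_nodes):
        port = base_port + node
        commands.append([
            "numactl",
            f"--cpunodebind={node}",   # run only on this node's CPUs
            f"--membind={node}",       # allocate only from this node's memory
            "mysqld",
            f"--defaults-file={defaults_dir}/my{port}.cnf",
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for cmd in numa_instance_commands(2):
        print(" ".join(cmd))
```

Binding both CPU and memory to the same node is the strictest strategy; interleaving memory across nodes is another common choice when a single large instance is preferred.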

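As for threads versus processes: in the Unix era thread support was not very good, so databases such as Oracle and PostgreSQL adopted a process model, while MySQL, which used threads, made very poor use of CPUs in its early days. For now, threads can be considered a trend in modern databases (I am not sure how accurate this is), because threads have certain advantages over processes: shared memory, a lower creation cost, and cheaper CPU context switches.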

news in mysqlbinlog – Back Up Master Binary Log Files

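Starting with MySQL 5.6, mysqlbinlog can read a master's binlog remotely and write it to local disk, which greatly strengthens binlog backup strategies. In a MySQL replication cluster, the binlog largely determines how completely data can be recovered, so backing it up is especially important. Many HA solutions, MHA for example, use the master's binlog to recover the data difference between the master and its slaves. If the master's physical machine goes down and cannot be restarted, the backed-up binlog can be used directly to recover the slaves. This feature therefore raises MySQL's disaster-recovery level and gives MySQL more than one disaster-recovery option.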

The --raw and --read-from-remote-server options control the output format and the server the binlog is read from, so a single management host can pull the binlogs of multiple masters.
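As a rough illustration of those options, the sketch below builds one mysqlbinlog command per master and starts it as a child process. It is not the binlog_backup_main.py script used in this post. The user name "backup", the starting file mysql-bin.000001, and the assumption that the password comes from an option file such as ~/.my.cnf are all placeholders; --stop-never, which keeps the stream open after the last log file, is an extra option not mentioned above.

```python
import subprocess


def binlog_backup_command(host, port, user, first_binlog, backup_dir):
    """Build a mysqlbinlog command that streams raw binlogs from one master."""
    return [
        "mysqlbinlog",
        "--read-from-remote-server",        # read from the server, not a local file
        "--raw",                            # write binary log files as-is
        "--stop-never",                     # keep waiting for new events
        f"--host={host}",
        f"--port={port}",
        f"--user={user}",                   # password assumed to be in an option file
        f"--result-file={backup_dir}/",     # with --raw this is a file name prefix
        first_binlog,
    ]


if __name__ == "__main__":
    masters = [("10.0.128.115", 3306), ("10.0.128.116", 3306), ("10.0.128.117", 3306)]
    workers = []
    for host, port in masters:
        backup_dir = f"/tmp/backup/binlog_backup/{host}.{port}"
        cmd = binlog_backup_command(host, port, "backup", "mysql-bin.000001", backup_dir)
        workers.append(subprocess.Popen(cmd))
    for worker in workers:
        worker.wait()
```

A real daemon would also need to restart a stream after a disconnect and resume from the last file already on disk rather than from the first one.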
Facebook reworked mysqlbinlog to speak a semi-sync-like protocol and uses it in place of semi-sync slave machines, so that multiple copies of the binlog are kept.

"We extended mysqlbinlog to speak Semisync protocol. The reason of the enhancement is that we wanted to use "semisync mysqlbinlog" as a replacement of local semisync slaves. We usually run slaves on remote datacenters, and we don't always need local slaves to serve read requests / redundancy. On the other hand, as described at above "Requirements for Semisync Deployment" section, in practice at least two local semisync readers are needed to make 
semisync work. We didn't like to run additional two dedicated slaves per master just for semisync. So we invented semisync mysqlbinlog and use it instead of semisync slaves, as shown in the below figure."

We back up the binlogs of multiple masters with mysqlbinlog in this way, combined with MHA's off-site binlog replication, to keep data loss to a minimum.

[root@pajk-super-master /usr/local/dbadmin/backup]
#nohup python binlog_backup_main.py &
#ps -ef | grep -i daemon
dbus      1056     1  0 May06 ?        00:00:00 dbus-daemon --system
root     24010 32696  0 10:58 pts/0    00:00:00 binlog_backup_daemon all    
root     24319 24010  0 10:59 pts/0    00:00:00 binlog_backup_daemon '10.0.128.115':'3306' 
root     24330 24010  0 10:59 pts/0    00:00:00 binlog_backup_daemon '10.0.128.116':'3306' 
root     24341 24010  0 10:59 pts/0    00:00:00 binlog_backup_daemon '10.0.128.117':'3306' 

[root@pajk-super-master /usr/local/dbadmin/backup]
#ls -ltr /tmp/backup/binlog_backup/10.0.128.115.3306/
total 250908
-rw-r--r-- 1 root root     27732 May 13 10:12 mysql-bin.000001
-rw-r--r-- 1 root root   1063490 May 13 10:12 mysql-bin.000002
-rw-r--r-- 1 root root       126 May 13 10:12 mysql-bin.000003
-rw-r--r-- 1 root root       143 May 13 10:12 mysql-bin.000005
-rw-r--r-- 1 root root     14000 May 13 10:12 mysql-bin.000004
-rw-r--r-- 1 root root     64918 May 13 10:12 mysql-bin.000006
-rw-r--r-- 1 root root   1216094 May 13 10:12 mysql-bin.000007
-rw-r--r-- 1 root root       143 May 13 10:12 mysql-bin.000008
-rw-r--r-- 1 root root 183388823 May 13 10:12 mysql-bin.000009
-rw-r--r-- 1 root root  20839355 May 13 10:12 mysql-bin.000010
-rw-r--r-- 1 root root  50039255 May 13 10:12 mysql-bin.000011
-rw-r--r-- 1 root root    250816 May 13 11:00 mysql-bin.000012

In addition, starting with version 0.56, MHA supports recovering logs from a binlog server:

Binlog server
Starting from MHA version 0.56, MHA supports new section [binlogN]. In binlog section, you can define mysqlbinlog streaming servers. When MHA does GTID based failover, MHA checks binlog servers, and if binlog servers are ahead of other slaves, MHA applies differential binlog events to the new master before recovery. When MHA does non-GTID based (traditional) failover, MHA ignores binlog servers.
Below is an example configuration.
  manager_host$ cat /etc/app1.cnf 
  [server default]
  # mysql user and password
  user=root
  password=mysqlpass
  # working directory on the manager
  manager_workdir=/var/log/masterha/app1
  # manager log file
  manager_log=/var/log/masterha/app1/app1.log
  # working directory on MySQL servers
  remote_workdir=/var/log/masterha/app1
  
  [server1]
  hostname=host1
  [server2]
  hostname=host2  
  [server3]
  hostname=host3
  [binlog1]
  hostname=binlog_host1
  [binlog2]
  hostname=binlog_host2 
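To make the [binlogN] convention concrete, here is a small sketch that reads a configuration in this style and lists the binlog servers it defines. It uses Python's standard configparser rather than anything from MHA itself, and the sample text is a trimmed, unindented copy of the example above.

```python
import configparser

SAMPLE_CONFIG = """\
[server default]
user=root
manager_workdir=/var/log/masterha/app1

[server1]
hostname=host1
[server2]
hostname=host2
[binlog1]
hostname=binlog_host1
[binlog2]
hostname=binlog_host2
"""


def binlog_hosts(config_text):
    """Return the hostnames of all [binlogN] sections, in file order."""
    # Interpolation is disabled so that '%' in values (e.g. passwords) is kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return [
        parser[section]["hostname"]
        for section in parser.sections()
        if section.startswith("binlog")
    ]


if __name__ == "__main__":
    print(binlog_hosts(SAMPLE_CONFIG))
```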

REF:semi-synchronous-replication-at-facebook
https://code.google.com/p/mysql-master-ha/wiki/Configuration#Binlog_server

MySQL 5.5&5.6 new features summary

upload on 2014.2 [MySQL 5.5&5.6 new features summary]

MySQL-Oslayer-Performance-Optimization

upload on 2014.1 [ten important tips of MySQL database design for better performance]

A bird's-eye view of the Yihaodian (1号店) architecture

ION: a server-based storage solution

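Fusion-io has launched ION, a shared-storage acceleration solution built on ioMemory. This post briefly outlines its rough architecture. ION can be configured with InfiniBand or 40Gb Ethernet connectivity, and it supports the FCoIB, FCoE, EoIB, and RDMA protocols.
Given the special relationship between Fusion-io and HP, the figures below mainly come from ION Accelerator on HP DL380.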

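[Image: QQ图片20130905230902]
Note that ION places certain requirements on the base server. On a 1U machine, the limited number of PCIe slots inevitably reduces I/O performance.
ION plays a role similar to a storage controller head: it uses its own software together with InfiniBand to turn an ordinary server into "storage". The concept is similar to QGUARD, and interested readers can look into it. For this kind of shared storage, the database application of choice is naturally Oracle RAC. As for open-source solutions, I doubt anyone would pay a high price for a stack of commercial software and hardware just to emulate storage :) Whether it can challenge Exadata (although Exadata targets different scenarios) remains to be seen.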

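[Image: QQ图片20130905231833]

At the lower layer, a server emulates the storage and still speaks the FC protocol.
ION currently does not support a cluster architecture (multiple servers emulating a storage cabinet), only a simple one-to-one HA architecture (similar to storage replication).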

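[Image: QQ图片20130905232041]

In a RAC architecture, the layout is similar to the traditional approach, but I/O capability is greatly increased (an application cluster with super powers, so to speak?). Comparable solutions include XtremSF and Flash Accel.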

[Image: QQ图片20130905232955]

For details, see FIO ION.