high-availability

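High Availability: Redis Sentinel is Redis's official high-availability solution

岁酱吖の submitted on 2019-12-05 12:07:54
Redis Sentinel: The Redis Sentinel system manages multiple Redis servers (instances) and performs three tasks. Monitoring: Sentinel continuously checks whether your master and replica servers are working as expected. Notification: when a monitored Redis server runs into a problem, Sentinel can notify developers through an API. Automatic failover: when a master goes down, Sentinel starts an automatic failover, promoting one replica to master and reconfiguring the other replicas to replicate from the new master; when a client tries to connect to the failed master, the cluster returns the address of the new master, so the new master takes the place of the failed server. Note: Redis Sentinel is a distributed system. You can run multiple Sentinel processes in one architecture; these processes use a gossip protocol to receive information about whether a master is down, and a voting protocol to decide whether to perform a failover and which replica to promote to master. Although Redis Sentinel ships as a separate executable, redis-sentinel, it is really just a Redis server running in a special mode; you can start Sentinel by passing the --sentinel option when starting an ordinary Redis server. Starting Sentinel: with the redis-sentinel program, you can start the Sentinel system with the following command: With the redis-server program, you can use the following command to start one running in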
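The voting step described in the excerpt can be illustrated with a small, simplified sketch. It is not Sentinel's actual implementation: the `Replica` class, the `master_is_down` and `pick_new_master` helpers, and the sample values are all hypothetical names introduced here. The selection order used (lower non-zero priority first, then the larger replication offset, then the smaller run ID) follows the rule described in the Redis Sentinel documentation, but treat the details as an assumption to check against your Redis version.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    """Hypothetical, minimal view of a replica as seen by the Sentinels."""
    run_id: str
    priority: int      # 0 means "never promote this replica"
    repl_offset: int   # how much of the master's stream was replicated


def master_is_down(reports: List[bool], quorum: int) -> bool:
    """True when at least `quorum` Sentinels report the master as down."""
    return sum(1 for down in reports if down) >= quorum


def pick_new_master(replicas: List[Replica]) -> Optional[Replica]:
    """Pick a promotion candidate, or None if no replica is eligible."""
    candidates = [r for r in replicas if r.priority > 0]
    if not candidates:
        return None
    # Lower priority wins, then larger offset, then smaller run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


if __name__ == "__main__":
    reports = [True, True, False]  # two of three Sentinels see the master as down
    replicas = [
        Replica("b2", priority=100, repl_offset=900),
        Replica("a1", priority=100, repl_offset=1200),
        Replica("c3", priority=0, repl_offset=5000),
    ]
    if master_is_down(reports, quorum=2):
        print(pick_new_master(replicas))
```

With a quorum of 2, two "down" reports are enough to trigger the selection, and the replica with priority 0 is skipped even though it has replicated the most data.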

High Availability With QJM: Detailed Deployment Steps

血红的双手。 submitted on 2019-12-05 08:19:48
Hadoop cluster setup:
Configure hosts (identical on all 4 nodes): 192.168.83.11 hd1 192.168.83.22 hd2 192.168.83.33 hd3 192.168.83.44 hd4
Configure the hostname (takes effect after a reboot): [hadoop@hd1 ~]$ more /etc/sysconfig/network NETWORKING=yes HOSTNAME=hd1
Configure the user and group: [hadoop@hd1 ~]$ id hadoop uid=1001(hadoop) gid=10010(hadoop) groups=10010(hadoop)
Configure the JDK: [hadoop@hd1 ~]$ env|grep JAVA JAVA_HOME=/usr/java/jdk1.8.0_11 [hadoop@hd1 ~]$ java -version java version "1.8.0_11" Java(TM) SE Runtime Environment (build 1.8.0_11-b12) Java HotSpot(TM) 64-Bit Server VM (build 25.11-b03, mixed mode)
Configure passwordless SSH login: ssh-keygen -t rsa ssh-keygen -t dsa cat ~/.ssh/*.pub > ~/.ssh/authorized_keys

Solr safe dataimport and core swap on high-traffic website

你说的曾经没有我的故事 submitted on 2019-12-04 18:54:28
Question: Hello fellow technicians, Let's assume we have a (PHP) website with millions of visitors a month and we are running a Solr index on the website with 4 million documents hosted. Solr is running on 4 separate servers, where one server is the master and the other 3 servers are replicas. Thousands of documents can be inserted into Solr every 5 minutes. Besides that, users can update their accounts, which should also trigger a Solr update. I am looking for a safe strategy to rebuild the index fast

High Availability With QJM

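北战南征 submitted on 2019-12-04 08:23:57
Node and instance planning: for the key points and caveats of a High Availability With QJM deployment, see the HA deployment section of https://my.oschina.net/u/3862440/blog/2208568 .
Edit "hdfs-site.xml".
dfs.nameservices: configures the nameservice. Each cluster has one service name; the services and nodes under that name are exposed externally as a single service. <property> <name>dfs.nameservices</name> <value>mycluster</value> </property>
dfs.ha.namenodes.[nameservice ID]: configures the IDs of all NNs in the nameservice. One nameservice contains several NN nodes, and for NN high availability the cluster must know each node's ID in order to tell them apart. I planned 2 NNs here, so there are two NN IDs. <property> <name>dfs.ha.namenodes.mycluster</name> <value>nn1,nn2</value> </property>
dfs.namenode.rpc-address.[nameservice ID].[name node ID]: configures the RPC address of every NN (used for communication between the NN and the DNs or clients). I configured 2 NNs and 3 DNs here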
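The three properties above depend on each other: every NN ID listed under dfs.ha.namenodes.[nameservice ID] needs its own dfs.namenode.rpc-address.[nameservice ID].[name node ID] entry. A minimal sketch of a consistency check follows, assuming the properties have already been loaded into a plain dictionary. The `missing_rpc_addresses` helper and the `hd1:8020` address are hypothetical and only illustrate the naming pattern; they are not values from the original post.

```python
from typing import Dict, List


def missing_rpc_addresses(conf: Dict[str, str]) -> List[str]:
    """Return the rpc-address keys that the HA settings require but lack."""
    missing = []
    for ns in conf.get("dfs.nameservices", "").split(","):
        ns = ns.strip()
        if not ns:
            continue
        # Every NN ID declared for this nameservice needs an RPC address.
        for nn in conf.get(f"dfs.ha.namenodes.{ns}", "").split(","):
            nn = nn.strip()
            if not nn:
                continue
            key = f"dfs.namenode.rpc-address.{ns}.{nn}"
            if not conf.get(key):
                missing.append(key)
    return missing


if __name__ == "__main__":
    conf = {
        "dfs.nameservices": "mycluster",
        "dfs.ha.namenodes.mycluster": "nn1,nn2",
        # Hypothetical address, used only to show the key pattern.
        "dfs.namenode.rpc-address.mycluster.nn1": "hd1:8020",
    }
    print(missing_rpc_addresses(conf))
```

Running the check on the sample dictionary reports that nn2 still has no RPC address configured.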

How to hand-over a TCP listening socket with minimal downtime?

試著忘記壹切 submitted on 2019-12-04 03:12:47
While this question is tagged EventMachine, generic BSD-socket solutions in any language are much appreciated too. Some background: I have an application listening on a TCP socket. It is started and shut down with a regular System V style init script. My problem is that it needs some time to start up before it is ready to service the TCP socket. It's not too long, perhaps only 5 seconds, but that's 5 seconds too long when a restart needs to be performed during a workday. It's also crucial that existing connections remain open and are finished normally. Reasons for a restart of the application
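One generic BSD-socket idea that fits the problem described is to keep the listening socket open and hand its file descriptor to the successor, so the kernel keeps queueing incoming connections while the new instance starts up. The sketch below is a simplified, single-process illustration under POSIX assumptions: `open_listener` and `take_over` are hypothetical helper names, and in a real restart the descriptor would reach a separate process through inheritance across fork/exec or over a Unix domain socket rather than through `os.dup`.

```python
import os
import socket


def open_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a TCP listening socket (port 0 lets the OS pick a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(16)
    return sock


def take_over(listener_fd: int) -> socket.socket:
    """Wrap a duplicate of an existing listening descriptor in a new socket object."""
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM,
                         fileno=os.dup(listener_fd))


if __name__ == "__main__":
    old = open_listener()
    address = old.getsockname()

    # The "successor" obtains the descriptor, then the old owner lets go.
    new = take_over(old.fileno())
    old.close()

    # The port never stopped listening, so a client can still connect.
    client = socket.create_connection(address, timeout=2)
    conn, _ = new.accept()
    client.sendall(b"ping")
    print(conn.recv(4))
    for s in (conn, client, new):
        s.close()
```

Because both descriptors refer to the same kernel socket, closing the old one does not close the port; connections that arrive during the switch wait in the accept backlog instead of being refused.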

Why are RDBMS considered Available (CA) for CAP Theorem

走远了吗. submitted on 2019-12-04 01:52:57
If I understand the CAP Theorem correctly, availability means that the cluster continues to operate even if a node goes down. I've seen a lot of people ( http://blog.nahurst.com/tag/guide ) list RDBMS as CA, but I do not understand how an RDBMS is available: if a node goes down, the cluster must go down to maintain consistency. My only possible answer to this has been that most RDBMS are a single node, so there is no "non-failing" node. But this seems to be a technicality, not true 'availability' and definitely not high availability. Thank you. First of all, let me clarify and state that the

NameNode HA when using hdfs:// URI

感情迁移 submitted on 2019-12-03 14:04:42
With HDFS or HFTP URI scheme (e.g. hdfs://namenode/path/to/file ) I can access HDFS clusters without requiring their XML configuration files. It is very handy when running shell commands like hdfs dfs -get , hadoop distcp or reading files from Spark like sc.hadoopFile() , because I don't have to copy and manage xml files for all relevant HDFS clusters to all nodes that those codes might potentially run. One drawback of this approach is that I have to use the active NameNode's hostname, otherwise Hadoop will throw an exception complaining that the NN is standby. A usual workaround is to try one

Solr safe dataimport and core swap on high-traffic website

本秂侑毒 submitted on 2019-12-03 12:21:31
Hello fellow technicians, Let's assume we have a (PHP) website with millions of visitors a month and we are running a Solr index on the website with 4 million documents hosted. Solr is running on 4 separate servers, where one server is the master and the other 3 servers are replicas. Thousands of documents can be inserted into Solr every 5 minutes. Besides that, users can update their accounts, which should also trigger a Solr update. I am looking for a strategy to rebuild the index quickly and safely without missing any document, and for a safe delta/update strategy. I have thought about
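For the "dataimport and core swap" approach named in the title, the two HTTP requests involved can be sketched as URL builders. This is a minimal illustration, not the poster's setup: the base URL and the core names `live` and `rebuild` are hypothetical, and the `/dataimport` handler path is only the commonly used name, since it depends on how the DataImportHandler is registered in solrconfig.xml. The `SWAP` action with its `core` and `other` parameters belongs to Solr's CoreAdmin API.

```python
from urllib.parse import urlencode


def full_import_url(base_url: str, core: str) -> str:
    """URL that asks the DataImportHandler of `core` to run a full import."""
    query = urlencode({"command": "full-import"})
    return f"{base_url.rstrip('/')}/{core}/dataimport?{query}"


def core_swap_url(base_url: str, live_core: str, rebuilt_core: str) -> str:
    """URL for the CoreAdmin SWAP action, which exchanges two cores' names."""
    query = urlencode({"action": "SWAP", "core": live_core, "other": rebuilt_core})
    return f"{base_url.rstrip('/')}/admin/cores?{query}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # hypothetical Solr base URL
    # 1. Rebuild the index in an offline core, 2. swap it with the live core.
    print(full_import_url(base, "rebuild"))
    print(core_swap_url(base, "live", "rebuild"))
```

The idea is that the rebuild runs against a core that receives no traffic, and the swap then makes the fresh index visible under the live core's name; documents written during the rebuild still need a separate delta step, which this sketch does not cover.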

Failover & Disaster Recovery [closed]

我的梦境 submitted on 2019-12-03 11:50:46
Question: What's the difference between failover and disaster recovery? Answer 1: Failover: when one machine fails, another machine (usually in the same location) takes over and resumes service. Disaster recovery: when Godzilla destroys your data center, you do have alternative locations to keep

How to setup Jenkins with HA?

谁都会走 submitted on 2019-12-03 11:14:43
Question: Currently we are using Jenkins as our CI system, with one master server and slaves that are provisioned by Saltstack on Openstack. If our Jenkins master server goes down, we need to create a new master and pull the files from the old master and put them on the new one, but it's gonna take at least 30 minutes. Is there any way to set up Jenkins with High Availability? I already checked the Gearman Plugin; however, if the Gearman server goes down for some reason, we need to setup a HA for