I have read the answers above and would like to complement them, because I think some details are still missing:
Fully managed system: Both systems are available as fully managed services. Google provides Pubsub, and there are fully managed Kafka offerings that you can run in the cloud or on-prem.
Cloud vs on-prem: I think this is a real difference between them, because Pubsub is only offered as part of the GCP ecosystem, whereas Apache Kafka can be used both as a cloud service and on-prem (where you configure the cluster yourself).
Message duplication
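- With Kafka, consumers track their position with offsets. Older consumers stored the offsets in Apache Zookeeper; current clients commit them to an internal Kafka topic (__consumer_offsets), and you can also store them yourself in an external system. Either way, the offsets record which messages the consumers have read so far. Pubsub works by acknowledging messages: if your code doesn't acknowledge a message before the deadline, the message is sent again. This means delivery is at-least-once, so the same message can arrive more than once. To avoid processing duplicates, make your handler idempotent or use Cloud Dataflow PubsubIO, which deduplicates for you.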
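To make the acknowledge-or-redeliver behavior concrete, here is a minimal in-memory sketch in Python. It is a toy model, not the real Pubsub client: ToySubscription, handle and run_demo are hypothetical names, time is passed in explicitly instead of being read from a clock, and duplicates are filtered by remembering message IDs on the consumer side.

```python
from dataclasses import dataclass, field


@dataclass
class ToySubscription:
    """In-memory stand-in for an ack-based subscription (NOT the real client API)."""
    ack_deadline: float
    # message_id -> (payload, time the current delivery expires, or None if never delivered)
    pending: dict = field(default_factory=dict)

    def publish(self, message_id, payload):
        self.pending[message_id] = (payload, None)

    def pull(self, now):
        """Deliver every message that is new or whose ack deadline has passed."""
        delivered = []
        for message_id, (payload, expiry) in list(self.pending.items()):
            if expiry is None or now >= expiry:
                self.pending[message_id] = (payload, now + self.ack_deadline)
                delivered.append((message_id, payload))
        return delivered

    def ack(self, message_id):
        # An acknowledged message is never delivered again on this subscription.
        self.pending.pop(message_id, None)


def run_demo():
    sub = ToySubscription(ack_deadline=10.0)
    sub.publish("m1", "order-created")
    sub.publish("m2", "order-paid")

    seen_ids = set()   # consumer-side memory used for deduplication
    processed = []     # payloads that were actually handled
    deliveries = 0     # how many times the subscription handed us a message

    def handle(batch, ids_to_ack):
        nonlocal deliveries
        for message_id, payload in batch:
            deliveries += 1
            if message_id not in seen_ids:  # skip work already done
                seen_ids.add(message_id)
                processed.append(payload)
            if message_id in ids_to_ack:
                sub.ack(message_id)

    handle(sub.pull(now=0.0), ids_to_ack={"m1"})   # m2 is handled but never acked
    handle(sub.pull(now=5.0), ids_to_ack=set())    # nothing: m2 deadline not reached
    handle(sub.pull(now=12.0), ids_to_ack={"m2"})  # m2 is redelivered after the deadline
    return deliveries, processed
```

In this sketch run_demo() reports three deliveries but only two processed payloads: the redelivered message is recognized by its ID and skipped, which is the same idea an idempotent handler relies on.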
Retention policy: Both Kafka and Pubsub let you configure the maximum retention time; the default, I think, is 7 days.
Consumer groups vs subscriptions: Be careful how you read messages in both systems. Pubsub uses subscriptions: you create a subscription and then start reading messages from it. Once a message is read and acknowledged, it is gone for that subscription. Kafka uses the concepts of "consumer group" and "partition": every consumer process belongs to a group, and once a message has been read from a specific partition, no other consumer process in the same "consumer group" will read that message (because the group's offset for that partition has advanced). You can think of the offset as a pointer that tells the processes which message to read next.
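The offset-as-pointer idea can be sketched with a small in-memory model. This is only an illustration under simplifying assumptions, not the Kafka client API: ToyPartitionedLog and its methods are hypothetical names, the offset is advanced on every poll (real consumers commit offsets separately), and partition assignment within a group is not modeled.

```python
from collections import defaultdict


class ToyPartitionedLog:
    """Append-only log with partitions and one offset per (group, partition)."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]
        self.offsets = defaultdict(int)  # (group, partition) -> next offset to read

    def append(self, partition, value):
        self.partitions[partition].append(value)

    def poll(self, group, partition):
        """Return the group's next unread record (or None) and advance its offset."""
        offset = self.offsets[(group, partition)]
        records = self.partitions[partition]
        if offset >= len(records):
            return None
        self.offsets[(group, partition)] = offset + 1
        return records[offset]

    def seek(self, group, partition, offset):
        """Move the group's pointer, for example back to 0 to re-read retained records."""
        self.offsets[(group, partition)] = offset


def run_demo():
    log = ToyPartitionedLog(num_partitions=1)
    for value in ("a", "b"):
        log.append(0, value)

    # Within one group, each record of the partition is handed out once.
    first = log.poll("billing", 0)
    second = log.poll("billing", 0)
    third = log.poll("billing", 0)  # nothing left for this group

    # A different group has its own offset and starts from the beginning.
    other = log.poll("audit", 0)

    # Reading did not delete anything: rewinding the offset replays the record.
    log.seek("billing", 0, 0)
    replayed = log.poll("billing", 0)
    return first, second, third, other, replayed
```

Here the "billing" group receives "a" and "b" once each, the "audit" group independently starts again at "a", and rewinding the "billing" offset replays "a". That last step is the main contrast with a subscription, where an acknowledged message is gone.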
I don't think there is a single correct answer to your question; it really depends on what you need and the constraints you have (below are some example scenarios):
If the solution must be in GCP, obviously use Google Cloud Pubsub. You will avoid the setup effort that Kafka requires, or the extra cost of paying for a fully managed Kafka service.
If the solution must process data in a streaming way but also needs to support batch processing (eventually), it is a good idea to use Cloud Dataflow + Pubsub.
If the solution requires some Spark processing, you could explore Spark Streaming (which you can configure to read from Kafka as the stream source).
In general, both are very solid stream processing systems. The point that makes the big difference is that Pubsub is a cloud service tied to GCP, whereas Apache Kafka can be used both in the cloud and on-prem.