For a dataset like the Diabetes one you can use a much simpler network. Try to reduce the neurons in your second layer. (Is there a specific reason why you chose tanh as the activation there?).
In addition you simply can add an EarlyStopping callback to your training: https://keras.io/callbacks/