AI makers can't agree on how to test whether their models behave responsibly, per Stanford HAI’s latest AI Index.